Hyperthermostable protease gene

ABSTRACT

There is disclosed a hyperthermostable protease gene originating in Pyrococcus furiosus, in particular, a hyperthermostable protease gene encoding the amino acid sequence represented by the SEQ ID NO 1 in the Sequence Listing or a part thereof which retains the activity of the hyperthermostable protease. There is also disclosed a process for producing the protease by culturing a transformant transformed with a plasmid into which the above gene has been inserted.

FIELD OF THE INVENTION

The present invention relates to a gene encoding a hyperthermostable protease which is useful as an enzyme for industrial application and a process for producing the enzyme by genetic engineering.

BACKGROUND OF THE INVENTION

Proteases are enzymes which cleave peptide bonds in proteins, and various proteases have been found in animals, plants and microorganisms. They are used not only as reagents for research work and medical supplies, but also in industrial fields such as additives for detergents, food processing and chemical syntheses utilizing their reverse reactions, and they are very important enzymes from an industrial viewpoint. Since very high physical and chemical stability is required of proteases used in industrial fields, enzymes having high thermostability are particularly preferred. At present, the proteases predominantly used in industrial fields are those produced by bacteria of the genus Bacillus because they have relatively high thermostability.

However, enzymes having further superior properties are desired, and attempts have been made to obtain enzymes from microorganisms which can grow at high temperatures, for example, thermophiles of the genus Bacillus.

On the other hand, a group of microorganisms named hyperthermophiles are well adapted to high-temperature environments and are therefore expected to be sources of various thermostable enzymes. It has been known that one of these hyperthermophiles, Pyrococcus furiosus, produces proteases [Appl. Environ. Microbiol., 56, 1992-1998 (1990); FEMS Microbiol. Letters, 71, 17-20 (1990); J. Gen. Microbiol., 137, 1193-1199 (1991)].

In addition, hyperthermophiles of the genera Thermococcus, Staphylothermus and Thermobacteroides have also been known to produce proteases [Applied Microbiology and Biotechnology, 34, 715-719 (1991)].

OBJECTS OF THE INVENTION

Since proteases produced by these hyperthermophiles have high thermostability, they are expected to be applicable to new applications for which no known enzyme has been utilized. However, the above publications merely teach that thermostable protease activities are present in cell-free extracts or crude enzyme solutions obtained from culture supernatants, and there is no disclosure about properties of isolated and purified enzymes and the like. Moreover, since cultivation of microorganisms at high temperature is required to obtain enzymes from these hyperthermophiles, there is a problem in industrial production of the enzymes.

In order to solve the above problems, an object of the present invention is to isolate a gene encoding a protease of a hyperthermophile. Another object of the present invention is to provide a process for producing the protease by genetic engineering using the gene.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates a restriction map of the plasmid pTPR1.

FIG. 2 illustrates a restriction map of the plasmid pTPR9.

FIG. 3 illustrates a restriction map of the plasmid pTPR12.

FIG. 4 illustrates a comparison of restriction maps of the DNAs derived from Pyrococcus furiosus contained in the plasmids.

FIG. 5 illustrates a restriction map of the plasmid pTPR14.

FIG. 6 illustrates a restriction map of the plasmid pTPR15.

FIG. 7 illustrates a restriction map of the plasmid pTPR13.

FIG. 8 illustrates a restriction map of the plasmid pUBP13.

FIG. 9 illustrates a restriction map of the plasmid pTPR36.

FIG. 10 illustrates a design of an oligonucleotide PRO-1F.

FIG. 11 illustrates designs of oligonucleotides PRO-2F and PRO-2R.

FIG. 12 illustrates a design of an oligonucleotide PRO-4R.

FIG. 13 illustrates a restriction map of the plasmid p2F-4R.

FIG. 14 illustrates a restriction map of the plasmid pTC1.

FIG. 15 illustrates a restriction map of the plasmid pTC3.

FIG. 16 illustrates thermostability of the hyperthermostable protease obtained in the present invention.

FIG. 17 illustrates the optimum pH of the hyperthermostable protease obtained in the present invention.

FIG. 18 illustrates the activity staining pattern after gelatin-containing SDS-polyacrylamide gel electrophoresis of the hyperthermostable protease obtained in the present invention.

DISCLOSURE OF THE INVENTION

In order to obtain a hyperthermostable protease gene, the present inventors attempted to purify a protease from microbial cells and from a culture supernatant of Pyrococcus furiosus DSM 3638, independently, so as to determine a partial amino acid sequence of the enzyme. However, purification of the protease was very difficult whether the microbial cells or the culture supernatant was used, and the present inventors failed to obtain an enzyme sample having sufficient purity for determination of its partial amino acid sequence.

As a method for cloning a gene for an objective enzyme without any information about the primary structure of the enzyme, there is an expression cloning method and, for example, a pullulanase gene originating in Pyrococcus woesei has been obtained according to this method (WO 92/02614). However, in an expression cloning method, a plasmid vector is generally used and, in such a case, it is necessary to use restriction enzymes which can cleave an objective gene into relatively small DNA fragments so that the fragments can be inserted into the plasmid vector without cleavage of any internal portion of the objective gene. Thus, the method is not always applicable to cloning of all kinds of enzyme genes. Furthermore, it is necessary to test a large number of clones for the enzyme activity and this operation is complicated.

The present inventors have attempted to isolate a protease gene by using a cosmid vector, which can maintain a larger DNA fragment (35-50 kb), instead of a plasmid vector to prepare a cosmid library of the Pyrococcus furiosus genome and investigating cosmid clones in the library to find a clone expressing a protease activity. By using a cosmid vector, the number of transformants to be screened can be reduced and the possibility of cleavage of an internal portion of the enzyme gene is lowered. On the other hand, since the copy number of a cosmid vector in a host is not as high as that of a plasmid vector, the amount of the enzyme expressed may be too small for its enzyme activity to be detected.

In view of the high thermostability of the objective enzyme, the present inventors have firstly cultured the respective transformants in a cosmid library separately, and have combined this step with a step for preparing lysates containing only thermostable proteins from the microbial cells thus obtained. This group of lysates has been named a cosmid protein library. By using the cosmid protein library in detection of the enzyme activity, detection sensitivity can be made higher than that of a method using transformant colonies.

In addition, the present inventors have made it possible to detect a trace amount of the enzyme activity by performing SDS-polyacrylamide gel electrophoresis with a gel containing gelatin. According to this method, a trace amount of a protease activity contained in a sample can be detected with high sensitivity as a band concentrated in the gel.

In this manner, the present inventors have screened a cosmid protein library originating in Pyrococcus furiosus and have obtained several cosmid clones which express the protease activity.

Furthermore, the present inventors have succeeded in isolating hyperthermostable protease genes from inserted DNA fragments contained in the clones by utilizing various genetic engineering techniques and have also found that products expressed from the genes are resistant to surfactants.

By comparing the amino acid sequence of the hyperthermostable protease deduced from the nucleotide sequence of the gene with amino acid sequences of known proteases originating in microorganisms, homology of the amino acid sequence of the front half portion of the protease encoded by the gene with those of a group of alkaline proteases, whose representative example is subtilisin, has been shown and, in particular, very high homology has been found at each region around the four amino acid residues which are known to be of importance for the catalytic activity of the enzymes. Thus, since the protease produced by Pyrococcus furiosus, which is active at such high temperatures that proteases originating in mesophiles are readily inactivated, has been shown to retain a structure similar to those of enzymes from mesophiles, it has been suggested that similar proteases would also be produced by hyperthermophiles other than Pyrococcus furiosus.

Then, the present inventors have noted the possibility that, in the nucleotide sequence of the hyperthermostable protease gene obtained, the nucleotide sequences encoding regions showing high homology with subtilisin and the like can be used as probes for investigating hyperthermostable protease genes, and have attempted to detect protease genes originating in hyperthermophiles by PCR using synthetic DNA designed based on the above nucleotide sequences as primers so as to clone DNA fragments containing the protease genes. As a result, the present inventors have found a protease gene in a hyperthermophile, Thermococcus celer DSM2476, and have obtained a DNA fragment containing the gene. Furthermore, the present inventors have confirmed that an amino acid sequence encoded by the DNA fragment contains amino acid sequences having high homology with the amino acid sequence of the hyperthermostable protease represented by SEQ ID NO 1 of the Sequence Listing. Thus, the present inventors have completed the present invention.

That is, the present invention provides isolated hyperthermostable protease genes originating in Pyrococcus furiosus, in particular, a hyperthermostable protease gene which encodes the amino acid sequence represented by SEQ ID NO 1 in the Sequence Listing or a part thereof constituting the active portion of the hyperthermostable protease, especially, the hyperthermostable protease gene having the DNA sequence represented by SEQ ID NO 2 in the Sequence Listing.

In addition, the present invention provides hyperthermostable protease genes hybridizable with the above hyperthermostable protease genes. For example, there is provided a hyperthermostable protease gene containing the nucleotide sequence represented by SEQ ID NO 7 in the Sequence Listing.

Moreover, the present invention provides a process for producing the hyperthermostable protease which comprises culturing a transformant transformed with a recombinant plasmid into which the hyperthermostable protease gene of the present invention has been inserted, and collecting the hyperthermostable protease from the culture.

The hyperthermostable protease genes of the present invention can be obtained by screening of gene libraries of hyperthermophiles. As the hyperthermophiles, bacteria belonging to the genus Pyrococcus can be used and the desired genes can be obtained by screening a cosmid library of the Pyrococcus furiosus genome.

For example, Pyrococcus furiosus DSM3638 can be used as Pyrococcus furiosus and this strain is available from Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH.

One example of cosmid libraries of Pyrococcus furiosus can be obtained by partially digesting the genomic DNA of Pyrococcus furiosus DSM3638 with a restriction enzyme, Sau3AI (manufactured by Takara Shuzo Co., Ltd.), to obtain DNA fragments, ligating the DNA fragments with a triple helix cosmid vector (manufactured by Stratagene) and packaging into lambda phage particles by an in vitro packaging method. Then, the library is transduced into a suitable E. coli, for example, E. coli DH5αMCR (manufactured by BRL) to obtain transformants, followed by culturing them, collecting the microbial cells, subjecting them to heat treatment (100° C. for 10 minutes), sonicating and subjecting them to heat treatment (100° C. for 10 minutes) again. The lysates thus obtained can be subjected to screening for the protease activity by performing SDS-polyacrylamide gel electrophoresis with a gel containing gelatin.

In this manner, a cosmid clone containing a hyperthermostable protease gene capable of expressing a protease which is resistant to the above heat treatment can be obtained.
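As an illustration of the digestion step used to build the library, the following minimal Python sketch locates Sau3AI recognition sites (GATC) in a sequence and lists the fragment lengths that a complete digest of a linear molecule would give. The sequence and the function names are invented for illustration only and are not taken from the Sequence Listing. In the partial digestion described above, only some of the sites are cut, so the actual fragments are much larger.

```python
# Toy illustration only: the sequence below is invented, not P. furiosus DNA.
SAU3AI_SITE = "GATC"  # Sau3AI cuts immediately 5' of its GATC recognition sequence


def sau3ai_cut_positions(seq: str) -> list[int]:
    """Return 0-based indices at which Sau3AI would cut the top strand."""
    seq = seq.upper()
    positions = []
    start = seq.find(SAU3AI_SITE)
    while start != -1:
        positions.append(start)  # the cut falls just before the G
        start = seq.find(SAU3AI_SITE, start + 1)
    return positions


def complete_digest_lengths(seq: str) -> list[int]:
    """Fragment lengths if every site in a linear molecule were cut."""
    cuts = [0] + sau3ai_cut_positions(seq) + [len(seq)]
    return [b - a for a, b in zip(cuts, cuts[1:]) if b > a]


toy = "AAGATCTTTTGGGATCCCCAAAAGATCGG"
print(sau3ai_cut_positions(toy))     # [2, 12, 23]
print(complete_digest_lengths(toy))  # [2, 10, 11, 6]
```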

Furthermore, a cosmid DNA prepared from the above-obtained cosmid clone can be digested with suitable restriction enzymes to form fragments to prepare recombinant plasmids into which the respective fragments thus obtained are inserted. A recombinant plasmid containing the desired hyperthermostable protease gene can be obtained by transforming a suitable microorganism with the above-obtained plasmids and testing for the protease activity expressed by the resultant transformants.

That is, a cosmid DNA prepared from one of the above-obtained cosmid clones can be digested with SphI (manufactured by Takara Shuzo Co., Ltd.), followed by inserting the resultant DNA fragment into the SphI site of a plasmid vector, pUC119 (manufactured by Takara Shuzo Co., Ltd.), to obtain a recombinant plasmid. Then, the recombinant plasmid is introduced into E. coli JM109 (manufactured by Takara Shuzo Co., Ltd.) and the protease activity of the resultant transformant is tested by the same method as that used for screening of the cosmid protein library. The transformant having the activity is used for preparation of a plasmid.

As is seen from the Examples hereinafter, one of the recombinant plasmids has been named pTPR1 and E. coli JM109 transformed with the plasmid has been named Escherichia coli JM109/pTPR1. FIG. 1 illustrates a restriction map of the plasmid pTPR1. In FIG. 1, the thick solid line represents the DNA fragment inserted into the plasmid vector pUC119. The recombinant plasmid contains an SphI fragment of about 7.0 kb.

In addition, a DNA fragment of about 2.5 kb which does not contain the hyperthermostable protease gene can be removed from the recombinant plasmid. That is, among three fragments of about 2.5 kb, about 3.3 kb and about 4.3 kb obtained by digesting the above plasmid pTPR1 with XbaI (manufactured by Takara Shuzo Co., Ltd.), only the DNA fragment of about 2.5 kb is removed and the remaining fragments are ligated and introduced into E. coli JM109. The protease activity of the resultant transformant is tested by the same method as that used for screening of the cosmid protein library. The resultant transformant having the protease activity is used for preparation of a plasmid. The plasmid has been named pTPR9 and E. coli JM109 transformed with the plasmid has been named Escherichia coli JM109/pTPR9. FIG. 2 illustrates a restriction map of the plasmid pTPR9. In FIG. 2, the thick solid line represents the DNA fragment inserted into the plasmid vector pUC119.
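The fragment sizes quoted for the XbaI digest can be checked with simple bookkeeping. The sketch below uses only the approximate sizes stated in the text; the variable names are illustrative, and the totals are correspondingly approximate.

```python
# Approximate XbaI fragment sizes of pTPR1 in kb, as quoted in the text.
xbai_fragments_ptpr1 = [2.5, 3.3, 4.3]
removed = 2.5  # fragment that does not contain the protease gene

total_ptpr1 = sum(xbai_fragments_ptpr1)
retained = [size for size in xbai_fragments_ptpr1 if size != removed]
total_ptpr9 = sum(retained)

print(round(total_ptpr1, 1))  # approximate size of pTPR1 (vector plus insert)
print(retained)               # fragments ligated to give pTPR9
print(round(total_ptpr9, 1))  # approximate size of pTPR9
```

On these figures pTPR1 is about 10.1 kb in total and pTPR9 about 7.6 kb.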

The protease activities expressed by both plasmids pTPR1 and pTPR9 show high thermostability. However, since the activities are observed at positions different from that for the protease activity expressed by the above cosmid clone on an SDS-polyacrylamide gel containing gelatin, these plasmids are presumed to lack a part of the protease gene present on the cosmid DNA. A DNA fragment containing the whole length of the protease gene can be obtained from the cosmid DNA by, for example, using a part of the inserted DNA fragment of the above plasmid pTPR9 as a probe. That is, the cosmid DNA used for preparation of the plasmid pTPR1 is digested with NotI (manufactured by Takara Shuzo Co., Ltd.) and several restriction enzymes which do not cleave any internal portion of the DNA fragment inserted into the plasmid pTPR1. After agarose gel electrophoresis, the DNA fragments in the gel are blotted onto a nylon membrane. With the membrane thus obtained, hybridization is carried out by using a PstI-XbaI fragment of about 0.7 kb obtained from the DNA fragment inserted into the plasmid pTPR9 as a probe to detect a DNA fragment containing the same sequence as that of the PstI-XbaI fragment.

In the cosmid DNA digested with two enzymes, NotI and PvuII (manufactured by Takara Shuzo Co., Ltd.), a DNA fragment of about 7.5 kb hybridizes with the PstI-XbaI fragment. This fragment of about 7.5 kb can be isolated and inserted, at a site between NotI and SmaI, into a plasmid vector, pUC19 (manufactured by Takara Shuzo Co., Ltd.), into which a NotI linker (manufactured by Takara Shuzo Co., Ltd.) has been introduced at the HincII site. The plasmid has been named pTPR12 and E. coli JM109 transformed with the plasmid has been named and indicated as Escherichia coli JM109/pTPR12. This strain has been deposited with the National Institute of Bioscience and Human-Technology (NIBH), Agency of Industrial Science & Technology, Ministry of International Trade & Industry under the accession number FERM BP-5103 under the Budapest Treaty since May 24, 1994 (the date of the original deposit).

A lysate of Escherichia coli JM109/pTPR12 shows a protease activity similar to that of the cosmid clone on an SDS-polyacrylamide gel containing gelatin. FIG. 3 illustrates a restriction map of the plasmid pTPR12. In FIG. 3, the thick solid line represents the DNA fragment inserted into the plasmid vector pUC19.

FIG. 4 illustrates restriction maps of the DNA fragments originating in Pyrococcus furiosus which are inserted into the plasmids pTPR1, pTPR9 and pTPR12, respectively. According to FIG. 4, a fragment of about 1 kb which does not contain a hyperthermostable protease gene can be removed from the DNA fragment inserted into the plasmid pTPR12. That is, the plasmid pTPR12 is digested with XbaI and KpnI (manufactured by Takara Shuzo Co., Ltd.) and the thus-obtained XbaI-XbaI fragment of about 3.3 kb and XbaI-KpnI fragment of about 3.2 kb are isolated, respectively. Then, firstly, the XbaI-KpnI fragment of about 3.2 kb is inserted into the plasmid vector pUC19 at a site between XbaI and KpnI to prepare a recombinant plasmid. This plasmid has been named pTPR14 and FIG. 5 illustrates its restriction map. In FIG. 5, the thick solid line represents the DNA fragment inserted into the plasmid vector pUC19.

Then, the above XbaI-XbaI fragment of about 3.3 kb is inserted into the plasmid pTPR14 at the XbaI site and introduced into E. coli JM109. The protease activity of the transformant is tested by using the method used for screening the cosmid protein library. A plasmid is prepared from the transformant having the activity. The plasmid has been named pTPR15 and E. coli JM109 transformed with the plasmid has been named Escherichia coli JM109/pTPR15. FIG. 6 illustrates a restriction map of pTPR15. In FIG. 6, the thick solid line represents the DNA fragment inserted into the plasmid vector pUC19.

Further, of the nucleotide sequence of the DNA fragment originating in Pyrococcus furiosus and inserted into the plasmid pTPR15, the nucleotide sequence of the DNA fragment of about 4.8 kb between two DraI sites is shown as SEQ ID NO 8 in the Sequence Listing. That is, SEQ ID NO 8 of the Sequence Listing is an example of the nucleotide sequence of the hyperthermostable protease gene of the present invention. The amino acid sequence of a product of the gene deduced from the nucleotide sequence of SEQ ID NO 8 is shown as SEQ ID NO 9 in the Sequence Listing. That is, SEQ ID NO 9 in the Sequence Listing is an example of the amino acid sequence of an enzyme protein produced by using the hyperthermostable protease gene obtained according to the present invention.

Because it has been found that the hyperthermostable protease gene of the present invention is contained in the DraI fragment of about 4.8 kb in the DNA fragment inserted into the above plasmid pTPR15, a recombinant plasmid containing only this DraI fragment can be prepared.

That is, the above plasmid pTPR15 is digested with DraI (manufactured by Takara Shuzo Co., Ltd.) to isolate the resultant DNA fragment of about 4.8 kb. Then, it can be inserted into the plasmid vector pUC19 at the SmaI site to prepare a recombinant plasmid. The recombinant plasmid has been named pTPR13 and E. coli JM109 transformed with the plasmid has been named Escherichia coli JM109/pTPR13.

A lysate of Escherichia coli JM109/pTPR13 shows the same protease activity as that of the cosmid clone on an SDS-polyacrylamide gel containing gelatin. FIG. 7 illustrates a restriction map of the plasmid pTPR13. In FIG. 7, the thick solid line represents the DNA fragment inserted into the plasmid vector pUC19.

In addition, the hyperthermostable protease gene of the present invention can be expressed in Bacillus subtilis. As the Bacillus subtilis, Bacillus subtilis DB104 can be used and this strain is a known strain described in Gene, Vol. 83, pp. 215-233 (1989). As a cloning vector, a plasmid pUB18-P43 can be used and this plasmid was provided by Dr. Sui-Lam Wong of the University of Calgary. This plasmid contains a kanamycin resistance gene as a selection marker.

The above-described plasmid pTPR13 can be digested with KpnI (manufactured by Takara Shuzo Co., Ltd.) and BamHI (manufactured by Takara Shuzo Co., Ltd.) to obtain a DNA fragment of about 4.8 kb, followed by isolating and ligating the fragment between the KpnI site and the BamHI site of the plasmid pUB18-P43 to prepare a recombinant plasmid. The plasmid has been named pUBP13 and Bacillus subtilis DB104 transformed with the plasmid has been named Bacillus subtilis DB104/pUBP13. A lysate of Bacillus subtilis DB104/pUBP13 shows the same protease activity as that of the cosmid clone on an SDS-polyacrylamide gel containing gelatin. FIG. 8 illustrates a restriction map of the plasmid pUBP13. In FIG. 8, the thick solid line represents the DNA fragment inserted into the plasmid vector pUB18-P43.

By comparing the amino acid sequence shown by SEQ ID NO 9 of the Sequence Listing with amino acid sequences of known proteases originating in microorganisms, it is shown that there is homology between the front half portion of the sequence of the hyperthermostable protease of the present invention and those of a group of alkaline serine proteases whose representative example is subtilisin [Protein Engineering, Vol. 4, pp. 719-737 (1991)]; in particular, there is high homology at each region around the four amino acid residues which are known to be of importance for protease activity. On the other hand, such homology cannot be observed in the back half portions of the amino acid sequences and it is considered that this portion may not be essential to a protease activity. Therefore, a mutant protease wherein an appropriate peptide chain is removed from its back half portion is expected to show the enzymatic activity. Examples of such a mutant protease include a protease having an amino acid sequence corresponding to SEQ ID NO 9 of the Sequence Listing from which the 904th amino acid, Ser, and the subsequent sequence have been removed. This can be prepared by the following process.

Firstly, a KpnI-EcoRI fragment of about 2.8 kb wherein the EcoRI site is blunted is prepared from the above plasmid pTPR13 and the fragment is ligated between the KpnI site and the blunted XbaI site of the plasmid vector pUC119. A protease gene contained in the recombinant plasmid thus obtained encodes an amino acid sequence corresponding to SEQ ID NO 9 of the Sequence Listing except that the nucleotide sequence TCA encoding the 904th amino acid, Ser, has been replaced with the termination codon TAG and the subsequent nucleotide sequence has been deleted. The plasmid has been named pTPR36 and E. coli JM109 transformed with the plasmid has been named Escherichia coli JM109/pTPR36. A lysate of Escherichia coli JM109/pTPR36 shows a protease activity on an SDS-polyacrylamide gel containing gelatin. FIG. 9 illustrates a restriction map of the plasmid pTPR36. In FIG. 9, the thick solid line represents the DNA fragment inserted into the plasmid vector pUC119. SEQ ID NO 2 in the Sequence Listing is the nucleotide sequence of the open reading frame contained in the DNA fragment inserted in the plasmid pTPR36. That is, SEQ ID NO 2 of the Sequence Listing is an example of nucleotide sequences of the hyperthermostable protease genes obtained in the present invention. In addition, SEQ ID NO 1 of the Sequence Listing is the amino acid sequence of the gene product deduced from the nucleotide sequence of SEQ ID NO 2. That is, SEQ ID NO 1 of the Sequence Listing is an example of amino acid sequences of enzyme proteins produced by using hyperthermostable protease genes obtained by the present invention.
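The effect of replacing a codon with a termination codon and deleting the downstream sequence can be sketched as follows. This is a minimal illustration on an invented toy open reading frame, not the procedure used to construct pTPR36 (which relies on restriction sites as described above); the function name is hypothetical. The coordinate calculation at the end assumes that nucleotides are numbered from the first base of the first codon of the open reading frame, which need not match the numbering of SEQ ID NO 8.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def truncate_at_codon(orf: str, codon_number: int, stop: str = "TAG") -> str:
    """Replace the 1-based codon `codon_number` with `stop` and drop the rest."""
    if stop not in STOP_CODONS:
        raise ValueError("stop must be a termination codon")
    start = 3 * (codon_number - 1)  # 0-based index of the codon's first base
    if start + 3 > len(orf):
        raise ValueError("codon_number lies beyond the end of the sequence")
    return orf[:start] + stop


# Toy ORF (invented): Met-Ala-Ser-Gly-stop
toy_orf = "ATGGCTTCAGGTTAA"
print(truncate_at_codon(toy_orf, 3))  # codon 3 (TCA, Ser) becomes TAG: ATGGCTTAG

# Bases occupied by codon 904, counting from the first base of the ORF
# (an assumption about numbering, see the paragraph above).
first_base = 3 * (904 - 1) + 1
print(first_base, first_base + 2)  # 2710 2712
```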

As described above, it has been found that the regions commonly present in alkaline serine proteases originating in mesophiles are conserved in the amino acid sequence of the hyperthermostable protease produced by the hyperthermophile Pyrococcus furiosus. Therefore, the presence of the regions is expected in the same kind of proteases produced by hyperthermophiles other than Pyrococcus furiosus. That is, it is possible to obtain genes for hyperthermostable proteases similar to the above-described hyperthermostable protease by preparing suitable synthetic DNA fragments based on parts of the nucleotide sequence of SEQ ID NO 2 of the Sequence Listing which encode amino acid sequences having high homology with those of subtilisin and the like, and using them as probes or primers.

FIGS. 10, 11 and 12 illustrate the relation among the amino acid sequences of regions in the amino acid sequence of the hyperthermostable protease of the present invention which have high homology with those of subtilisin and the like, the nucleotide sequences of the hyperthermostable protease gene of the present invention which encode the regions, and the nucleotide sequences of oligonucleotides PRO-1F, PRO-2F, PRO-2R and PRO-4R synthesized based on the above sequences, respectively. In addition, SEQ ID NO 3, 4, 5 and 6 of the Sequence Listing illustrate the nucleotide sequences of the oligonucleotides PRO-1F, PRO-2F, PRO-2R and PRO-4R. That is, SEQ ID NO 3, 4, 5 and 6 of the Sequence Listing are examples of oligonucleotides which can be used for detection of the hyperthermostable protease genes of the present invention by hybridization.

A combination of the above oligonucleotides can be used as primers to carry out PCR using genomic DNA of various hyperthermophiles as templates to detect protease genes present in hyperthermophiles. As the hyperthermophiles, bacteria belonging to the genera Pyrococcus, Thermococcus, Staphylothermus, Thermobacteroides and the like can be used. As bacteria belonging to the genus Thermococcus, Thermococcus celer DSM2476 can be used and the strain is available from Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH. When PCR is carried out by using genomic DNA of Thermococcus celer DSM2476 as a template and a combination of the above oligonucleotides PRO-1F and PRO-2R or a combination of PRO-2F and PRO-4R as primers, specific amplification of DNA fragments is observed and the presence of a protease gene can be indicated. In addition, an amino acid sequence encoded by the fragment can be estimated by ligating the fragment to a suitable plasmid vector to prepare a recombinant plasmid and determining the nucleotide sequence of the inserted DNA fragment by the dideoxy method.
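The logic of amplifying the region between a forward and a reverse primer can be sketched in Python as a simple string search. The template and primer sequences below are invented and are not those of PRO-1F, PRO-2R, PRO-2F or PRO-4R; the function names are hypothetical. The sketch also assumes exact primer matches, whereas real PCR tolerates some mismatches, which is what allows primers designed from one organism to amplify a homologous gene in another.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def predicted_amplicon(template: str, forward: str, reverse: str):
    """Return the product for exactly matching primers, or None if either is absent."""
    template = template.upper()
    fwd_start = template.find(forward.upper())
    if fwd_start == -1:
        return None
    # The reverse primer anneals to the top strand as its reverse complement.
    rev_site = reverse_complement(reverse)
    rev_start = template.find(rev_site, fwd_start + len(forward))
    if rev_start == -1:
        return None
    return template[fwd_start:rev_start + len(rev_site)]


# Invented toy template and primers.
template = "TTTTACGTACGGATTTCCCGGGAAATTTGCATGCAACCCC"
forward = "ACGTACGG"
reverse = "TTGCATGC"

product = predicted_amplicon(template, forward, reverse)
print(product, len(product))
```

The product includes both primer sequences at its ends, which is why the primer-derived positions are identified separately when the amplified fragments are sequenced.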

A DNA fragment of about 150 bp amplified by using the oligonucleotides PRO-1F and PRO-2R and a DNA fragment of about 550 bp amplified by using the oligonucleotides PRO-2F and PRO-4R are ligated to the HincII site of the plasmid vector pUC18 to obtain recombinant plasmids, respectively. The recombinant plasmids have been named p1F-2R(2) and p2F-4R, respectively. SEQ ID NO 10 of the Sequence Listing illustrates the nucleotide sequence of the DNA fragment inserted into the plasmid p1F-2R(2) and an amino acid sequence deduced therefrom. SEQ ID NO 11 of the Sequence Listing illustrates the nucleotide sequence of the DNA fragment inserted into the plasmid p2F-4R and an amino acid sequence deduced therefrom. In the nucleotide sequence shown by SEQ ID NO 10 of the Sequence Listing, the sequence from the 1st to the 21st nucleotides and the sequence from the 113th to the 145th nucleotides and, in the nucleotide sequence shown by SEQ ID NO 11 of the Sequence Listing, the sequence from the 1st to the 32nd nucleotides and the sequence from the 532nd to the 564th nucleotides are the nucleotide sequences derived from the oligonucleotides used as the primers (corresponding to the oligonucleotides PRO-1F, PRO-2R, PRO-2F and PRO-4R, respectively). In the amino acid sequences shown by SEQ ID NO 10 and 11, there are sequences having homology with amino acid sequences of the hyperthermostable protease originating in Pyrococcus furiosus of the present invention as well as alkaline serine proteases originating in various microorganisms, and it has been shown that the above DNA fragments amplified by PCR are those amplified utilizing the protease gene as the template.
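Because the ends of each amplified fragment derive from the primers rather than from the template, only the internal portion reflects the Thermococcus celer sequence itself. The following short sketch computes that internal span from the coordinates quoted above; the function name is illustrative.

```python
def internal_region(fwd_range, rev_range):
    """Return (first, last, length) of the 1-based span between two primer-derived ranges."""
    first = fwd_range[1] + 1
    last = rev_range[0] - 1
    return first, last, last - first + 1


# Primer-derived positions as quoted in the text.
seq10 = internal_region((1, 21), (113, 145))   # SEQ ID NO 10 (PRO-1F / PRO-2R)
seq11 = internal_region((1, 32), (532, 564))   # SEQ ID NO 11 (PRO-2F / PRO-4R)

print(seq10)  # (22, 112, 91)
print(seq11)  # (33, 531, 499)
```

On these coordinates, 91 nucleotides of SEQ ID NO 10 and 499 nucleotides of SEQ ID NO 11 lie between the primer-derived regions.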

FIG. 13 illustrates a restriction map of the plasmid p2F-4R. In FIG. 13, the thick solid line represents the DNA fragment inserted into the plasmid vector pUC18.

On the other hand, when genomic DNAs of Thermobacteroides proteolyticus DSM5265 and Staphylothermus marinus DSM3639 are used as templates, amplification as observed in the case of Thermococcus celer has not been recognized.

It has been known that the efficiency of gene amplification by PCR is influenced by the efficiency of annealing between the 3'-terminal portion of a primer and the template DNA. Even when amplification of a DNA fragment is not observed in the above PCR, protease genes can be detected by synthesizing oligonucleotides having different sequences but encoding the same amino acid sequence and using them as primers. In addition, protease genes can also be detected by using these oligonucleotides as probes and carrying out Southern hybridization with genomic DNA of various hyperthermophiles.
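Oligonucleotides that differ in sequence but encode the same amino acid sequence arise from the degeneracy of the genetic code. As a minimal sketch, the Python code below enumerates every coding sequence for a short peptide using the standard genetic code. The peptide is chosen only for illustration and is not taken from the Sequence Listing, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codons_for(residue: str) -> list[str]:
    """All codons encoding a one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == residue)


def synonymous_oligos(peptide: str) -> list[str]:
    """Every nucleotide sequence that encodes `peptide`."""
    return ["".join(combo) for combo in product(*(codons_for(r) for r in peptide))]


def translate(seq: str) -> str:
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


toy_peptide = "HGT"  # illustrative only
oligos = synonymous_oligos(toy_peptide)
print(len(oligos))  # 2 codons for His x 4 for Gly x 4 for Thr = 32
print(oligos[:3])
```

Since annealing at the 3'-terminal portion of a primer matters most, one would typically vary the codons nearest the 3' end first when choosing among such alternatives.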

Then, the above-described oligonucleotides or amplified DNA fragments obtained by the above PCR can be used as probes for screening genomic DNA libraries of hyperthermophiles to obtain hyperthermostable protease genes, for example, the hyperthermostable protease gene of Thermococcus celer.

As an example of genomic DNA libraries of Thermococcus celer, there is a library prepared by partially digesting genomic DNA of Thermococcus celer DSM2476 with a restriction enzyme Sau3AI to obtain DNA fragments, ligating the fragments with lambda GEM-11 vector (manufactured by Promega) and packaging them into lambda phage particles according to an in vitro packaging method. Then, the library is transduced into a suitable E. coli, for example, E. coli LE392 (manufactured by Promega) to form plaques on a plate and then plaque hybridization is carried out by using amplified DNA fragments obtained in the above-described PCR. In this manner, phage clones containing hyperthermostable protease genes can be obtained.

Further, the phage DNA prepared from the clone thus obtained is digested with suitable restriction enzymes and, after agarose gel electrophoresis, DNA fragments in the gel are blotted onto a nylon membrane. With the membrane thus obtained, hybridization is carried out using amplified DNA fragments obtained according to the above PCR as probes to detect a DNA fragment containing the protease gene.

When the above phage DNA is digested with KpnI, a DNA fragment of about 9 kb hybridizes with the probe and this fragment of about 9 kb can be isolated and inserted into the KpnI site of the plasmid vector pUC119 to obtain a recombinant plasmid. This plasmid has been named pTC1 and E. coli JM109 transformed with this plasmid has been named Escherichia coli JM109/pTC1.

FIG. 14 illustrates a restriction map of the plasmid pTC1. In FIG. 14, the thick solid line represents the DNA fragment inserted into the plasmid vector pUC119.

Furthermore, a DNA fragment of about 4 kb which does not contain the hyperthermostable protease gene can be removed from the plasmid pTC1. That is, the plasmid pTC1 is digested with KpnI and with several restriction enzymes which cleave within the fragment inserted into the plasmid pTC1 and, after agarose gel electrophoresis, a DNA fragment containing the protease gene is detected in the same manner as for the above phage DNA. When the plasmid pTC1 is digested with KpnI and BamHI, a DNA fragment of about 5 kb hybridizes with the probe, and this fragment of about 5 kb can be isolated and introduced into the KpnI-BamHI site of the plasmid vector pUC119 to obtain a recombinant plasmid. This plasmid has been named pTC3 and E. coli JM109 transformed with this plasmid has been named Escherichia coli JM109/pTC3. FIG. 15 illustrates a restriction map of the plasmid pTC3. In FIG. 15, the thick solid line represents the DNA fragment inserted into the plasmid vector pUC119.

The nucleotide sequence of the hyperthermostable protease gene contained in the DNA fragment inserted into the plasmid pTC3 can be determined by using specific primers, i.e., by using as primers suitable oligonucleotides synthesized based on the nucleotide sequences shown by SEQ ID NO 10 and 11 of the Sequence Listing. SEQ ID NO 12, 13, 14, 15, 16 and 17 represent the nucleotide sequences of the oligonucleotides TCE-2, TCE-4, SEF-3, SER-1, SER-3 and TCE-6R, which have been used as the primers for determination of the nucleotide sequence of the hyperthermostable protease gene. In addition, SEQ ID NO 7 of the Sequence Listing represents a part of the nucleotide sequence of the hyperthermostable protease gene thus obtained, that is, a part of the nucleotide sequence of the hyperthermostable protease gene of the present invention. Moreover, SEQ ID NO 18 represents an amino acid sequence of an example of the enzyme encoded by the hyperthermostable protease gene obtained by the present invention. In the DNA fragment inserted into the plasmid pTC3, the sequence derived from the lambda GEM-11 vector is adjacent to the 5'-end of the nucleotide sequence represented by SEQ ID NO 7 of the Sequence Listing, indicating that a part of the 5'-region of the protease gene is missing. In addition, by comparing the nucleotide sequence with those of SEQ ID NO 10 and 11 of the Sequence Listing, it has been found that the DNA fragment inserted into the plasmid pTC3 contains the 41st and subsequent nucleotides of the nucleotide sequence represented by SEQ ID NO 10 and the whole nucleotide sequence of SEQ ID NO 11.

Although the hyperthermostable protease gene obtained from Thermococcus celer lacks a part thereof, as is obvious to a person skilled in the art, a DNA fragment containing the whole length of the hyperthermostable protease gene can be obtained, for example, (1) by repeating the screening of a genomic DNA library, (2) by carrying out Southern hybridization with genomic DNA, (3) by obtaining a DNA fragment of the 5'-upstream region by PCR with a cassette (manufactured by Takara Shuzo Co., Ltd.) and cassette primers (manufactured by Takara Shuzo Co., Ltd.) (Takara Shuzo's Genetic Engineering Products Guide, 1994-1995 ed., pp. 250-251), and the like.

A transformant into which a recombinant plasmid containing the hyperthermostable protease gene has been transduced, for example, Escherichia coli JM109/pTPR13 or Escherichia coli JM109/pTPR36, can be cultured under conventional conditions, for example, by culturing the transformant in LB medium [tryptone (10 g/liter), yeast extract (5 g/liter), NaCl (5 g/liter); pH 7.2] containing 100 μg/ml of ampicillin at 37° C. to express the hyperthermostable protease in the culture. After completion of the culture, the cultured cells are harvested, sonicated and centrifuged. The supernatant is subjected to heat treatment at 100° C. for 5 minutes to denature and remove contaminating proteins. In this way, a crude enzyme sample can be obtained. The crude enzyme samples thus obtained from Escherichia coli JM109/pTPR13 and Escherichia coli JM109/pTPR36 have been named PF-13 and PF-36, respectively.

Further, a transformant into which a recombinant plasmid containing the hyperthermostable protease gene has been transduced, Bacillus subtilis DB104/pUBP13, can be cultured under conventional conditions, for example, by culturing the transformant in LB medium containing 10 μg/ml of kanamycin at 37° C. to express the hyperthermostable protease in the culture. After completion of the culture, the cultured cells are harvested, sonicated and centrifuged. The supernatant is subjected to heat treatment at 100° C. for 5 minutes to denature and remove contaminating proteins, followed by salting out with ammonium sulfate and dialysis. In this way, a partially purified enzyme sample can be obtained. The partially purified enzyme sample thus obtained from Bacillus subtilis DB104/pUBP13 has been named PF-BS13.

The enzymatic and physicochemical properties of the hyperthermostable protease samples produced by the transformants carrying the recombinant plasmids containing the hyperthermostable protease genes derived from Pyrococcus furiosus obtained by the present invention, for example, PF-13, PF-36 and PF-BS13, are as follows.

(1) Activity

The enzymes obtained by the present invention hydrolyze gelatin to form short chain polypeptides. In addition, they hydrolyze casein to form short chain polypeptides.

(2) Method for detecting enzyme activity

The enzyme activity was detected by detecting hydrolysis of gelatin by the enzyme in an SDS-polyacrylamide gel. Namely, an enzyme sample to be tested was suitably diluted, and to 10 μl of the diluted sample solution was added 2.5 μl of a sample buffer solution (50 mM Tris-HCl pH 7.6, 5% SDS, 5% 2-mercaptoethanol, 0.005% Bromophenol Blue, 50% glycerol). The mixture was subjected to heat treatment at 100° C. for 5 minutes and then to electrophoresis using a 0.1% SDS-10% polyacrylamide gel containing 0.05% gelatin. After completion of the electrophoresis, the gel was soaked in 50 mM potassium phosphate buffer (pH 7.0) and incubated at 95° C. for 2 hours to carry out the enzymatic reaction. Then, the gel was stained with 2.5% Coomassie Brilliant Blue R-250 in 25% ethanol and 10% acetic acid for 30 minutes and was further transferred into 25% ethanol and 7% acetic acid to remove excess dye over 3 to 15 hours. Gelatin hydrolyzed by the protease into peptides diffused out of the gel during the enzymatic reaction, so that the corresponding position was not stained with Coomassie Brilliant Blue, thereby revealing the presence of the protease activity. The enzyme samples obtained by the present invention, PF-13, PF-36 and PF-BS13, had gelatin hydrolyzing activity at 95° C.

In addition, the casein hydrolyzing activity was detected in the same manner as described above except that a 0.1% SDS-10% polyacrylamide gel containing 0.05% of casein was used. The enzyme samples obtained by the present invention, PF-13, PF-36 and PF-BS13, had casein hydrolyzing activity at 95° C.

Moreover, the casein hydrolyzing activity of the enzyme sample obtained by the present invention, PF-BS13, was determined by the following method. To 100 μl of 0.1M potassium phosphate buffer (pH 7.0) containing 0.2% casein was added 100 μl of a suitably diluted enzyme solution, and the mixture was incubated at 95° C. for 1 hour. The reaction was stopped by addition of 100 μl of 15% trichloroacetic acid and the reaction mixture was centrifuged. The amount of acid soluble short chain polypeptides contained in the supernatant was determined by measuring absorbance at 280 nm, and the enzyme activity was determined by comparing the absorbance with that of an enzyme free control. The enzyme sample obtained by the present invention, PF-BS13, had casein hydrolyzing activity under the experimental conditions of pH 7.0 at 95° C.
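The volumes given for this assay imply the following final concentrations, worked out here as a short sketch. The function names are hypothetical, and the absorbance values in the usage line are made-up placeholders, not measured data.

```python
def final_concentration(stock, volume_added_ul, total_volume_ul):
    """Concentration after dilution: stock x (volume added / total volume)."""
    return stock * volume_added_ul / total_volume_ul


def net_absorbance(a280_sample, a280_control):
    """A280 of the sample supernatant minus that of the enzyme free control."""
    return a280_sample - a280_control


# 100 ul of substrate solution plus 100 ul of enzyme solution: 200 ul reaction.
casein_percent = final_concentration(0.2, 100, 200)      # % casein in the reaction
phosphate_molar = final_concentration(0.1, 100, 200)     # M potassium phosphate
# Adding 100 ul of 15% trichloroacetic acid brings the total to 300 ul.
tca_percent = final_concentration(15, 100, 300)          # % TCA after stopping

if __name__ == "__main__":
    print(casein_percent, phosphate_molar, tca_percent)
    # Placeholder readings only, to show how the control is subtracted.
    print(net_absorbance(0.50, 0.10))
```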

(3) Stability

The stability of the enzyme was examined by detecting the remaining enzymatic activity of heat treated enzymes by the above-described method (2) using an SDS-polyacrylamide gel containing gelatin. Namely, the enzyme sample was incubated at 95° C. for 3 hours and then a suitable amount thereof was subjected to detection of the enzymatic activity to compare its activity with that without treatment at 95° C. Although the position of the enzyme activity on the gel changed somewhat due to incubation at 95° C., scarcely any lowering of the enzyme activity was observed. The enzyme samples obtained by the present invention, PF-13, PF-36 and PF-BS13, were stable to heat treatment at 95° C. for 3 hours.

In addition, the stability of the enzyme samples obtained by the present invention, PF-13 and PF-36, in the presence of surfactants was tested. Namely, Triton X-100, SDS or benzalkonium chloride was added to the enzyme samples to a final concentration of 0.1%. The mixture was incubated at 95° C. for 3 hours and a suitable amount thereof was subjected to detection of the enzymatic activity. For each surfactant, no substantial change in the enzyme activity was found in comparison with that in the absence of the surfactant. Thus, the enzyme samples obtained by the present invention, PF-13 and PF-36, were stable to heat treatment at 95° C. for 3 hours in the presence of surfactants.

Moreover, the stability of the enzyme sample obtained by the present invention, PF-BS13, was tested by the following method. Namely, the enzyme sample as such or with addition of SDS to a final concentration of 0.1% was incubated at 95° C. for various periods of time and the remaining activity was determined by the spectrophotometric method described in (2) above, based on the increase in the amount of acid soluble polypeptides. FIG. 16 illustrates the thermostability of the enzyme sample obtained by the present invention, PF-BS13. The ordinate indicates the remaining activity (%) and the abscissa indicates the incubation time (hr). In FIG. 16, the open circle represents the results obtained without addition of SDS and the closed circle represents the results in the presence of 0.1% SDS. As seen from FIG. 16, PF-BS13 maintained almost 100% activity after incubation at 95° C. for 4 hours regardless of the presence or absence of 0.1% SDS.

(4) Effect of various reagents

The enzyme samples were subjected to electrophoresis on an SDS-polyacrylamide gel containing gelatin and then the enzymatic reaction was carried out in 50 mM potassium phosphate buffer (pH 7.0) containing 2 mM EDTA or 2 mM phenylmethanesulfonyl fluoride (PMSF) to test the effect of each reagent on the enzyme activity. No substantial difference in the enzyme activities of the enzyme samples obtained by the present invention, PF-13, PF-36 and PF-BS13, was observed between the buffer containing 2 mM EDTA and 50 mM potassium phosphate buffer alone. On the other hand, when the buffer containing 2 mM PMSF was used, the amount of hydrolyzed gelatin in the gel was decreased in all the samples, indicating that the activities of the enzyme samples were inhibited by PMSF.

(5) Molecular weight

The molecular weight of the enzyme samples obtained by the present invention was estimated on an SDS-polyacrylamide gel containing gelatin. The enzyme sample PF-13 showed plural active bands within the range of 95 kDa to 51 kDa. Although the migration distance varied according to the amount of sample applied, etc., major bands of 84 kDa, 79 kDa, 66 kDa, 54 kDa and 51 kDa appeared. When the enzyme sample was subjected to electrophoresis after heat treatment at 95° C. for 3 hours in the presence of SDS at a final concentration of 0.1%, the bands of 63 kDa and 51 kDa became more intense. For the enzyme sample PF-BS13, the same results as those for the enzyme sample PF-13 were obtained. In the case of the enzyme sample PF-36, several minor bands were observed in addition to the main bands of 63 kDa and 59 kDa.

(6) Optimum pH

The optimum pH of the enzyme samples PF-13 and PF-36 obtained by the present invention was tested. After subjecting the enzyme samples to electrophoresis on an SDS-polyacrylamide gel containing gelatin, the gel was soaked in buffers having different pH values and the enzyme reaction was carried out to test for the optimum pH. As the buffers, 50 mM sodium acetate buffer solution at pH 4.0 to 6.0, 50 mM potassium phosphate buffer solution at pH 6.0 to 8.0 and 50 mM sodium borate buffer solution at pH 9.0 to 10.0 were used. Both enzyme samples showed gelatin hydrolyzing activity at pH 6.0 to 10.0 and their optimum pH was pH 8.0 to 9.0.

In addition, the optimum pH of the enzyme sample obtained by the present invention, PF-BS13, was determined by the spectrophotometric method described in (2) above, based on the increase in the amount of acid soluble polypeptides. The 0.2% casein solutions used for the determination were prepared with 0.1M sodium acetate buffer solution at pH 4.0 to 6.0, 0.1M potassium phosphate buffer solution at pH 6.0 to 8.0, 0.1M sodium borate buffer solution at pH 9.0 to 10.0 and 0.1M sodium phosphate-sodium hydroxide buffer solution at pH 11.0. FIG. 17 illustrates the relation between the casein hydrolyzing activity of the enzyme sample obtained by the present invention, PF-BS13, and pH. The ordinate indicates the relative activity (%) and the abscissa indicates pH. In FIG. 17, the open circle, the closed circle, the open square and the closed square represent the results obtained by using the substrate solutions prepared with 0.1M sodium acetate buffer solution, 0.1M potassium phosphate buffer solution, 0.1M sodium borate buffer solution and 0.1M sodium phosphate-sodium hydroxide buffer solution, respectively. As seen from FIG. 17, the enzyme sample PF-BS13 showed casein hydrolyzing activity in the pH range of 5.0 to 11.0 and its optimum pH was pH 9.0 to 10.0.

As described hereinabove in detail, according to the present invention, the genes encoding the hyperthermostable proteases and the industrial process for producing the hyperthermostable proteases using the genes can be provided. The enzymes have high thermostability and also show resistance to surfactants. Therefore, they are particularly useful for treatment of proteins at high temperatures.

In addition, a DNA fragment obtained by hybridization using as a probe the gene isolated by the present invention or a part of the nucleotide sequence of the isolated gene can be transduced into a suitable microorganism, and its heat-treated lysate can be prepared in the same manner as that described with respect to the cosmid protein library. Then, the protease activity is tested by an appropriate method. In this manner, a hyperthermostable protease gene encoding an enzyme whose sequence is not identical with that of the above enzyme but which has a similar activity can be obtained.

The above hybridization can be carried out under the following conditions. Namely, DNA fixed on a membrane is incubated in 6×SSC containing 0.5% of SDS, 0.1% of bovine serum albumin, 0.1% of polyvinyl pyrrolidone, 0.1% of Ficoll 400 and 0.01% of denatured salmon sperm DNA (1×SSC represents 0.15M NaCl and 0.015M sodium citrate, pH 7.0) together with a probe at 50° C. for 12 to 20 hours. After completion of the incubation, the membrane is washed, starting with 2×SSC containing 0.5% SDS at 37° C., followed by changing the SSC concentration within the range down to 0.1× and varying the temperature up to 50° C., until a signal from the fixed DNA can be distinguished from the background signal.
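Since the washes are specified as multiples of SSC, the corresponding salt concentrations follow directly from the definition of 1×SSC given above. The following is a small sketch of that arithmetic; the function name and dictionary keys are hypothetical.

```python
# Definition from the text: 1xSSC is 0.15 M NaCl and 0.015 M sodium citrate.
NACL_MOLAR_AT_1X = 0.15
CITRATE_MOLAR_AT_1X = 0.015


def ssc_molarity(strength):
    """Return the NaCl and sodium citrate molarity of an n-fold SSC solution."""
    return {
        "NaCl_M": strength * NACL_MOLAR_AT_1X,
        "sodium_citrate_M": strength * CITRATE_MOLAR_AT_1X,
    }


if __name__ == "__main__":
    # Hybridization buffer (6x), first wash (2x) and most stringent wash (0.1x).
    for strength in (6, 2, 0.1):
        print(strength, ssc_molarity(strength))
```

Lower SSC strength together with higher temperature gives a more stringent wash, which is why the procedure moves from 2×SSC at 37° C. toward 0.1×SSC at 50° C.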

Furthermore, the gene isolated by the present invention, a DNA fragment obtained by in vitro gene amplification using a part of the isolated gene as a primer, or a DNA fragment obtained by hybridization using the fragment obtained by the above amplification as a probe is transduced into a suitable microorganism and, in the same manner as described above, the protease activity is determined. In this manner, a hyperthermostable protease gene encoding an enzyme whose activity is not identical with that of the above enzyme but is similar can be obtained.

The following examples further illustrate the present invention but are not to be construed to limit the scope thereof. In the examples, all the "percents" are by weight.

EXAMPLE 1

Preparation of genomic DNA of Pyrococcus furiosus

Pyrococcus furiosus DSM3638 was cultured as follows.

A culture medium composed of 1% of tryptone, 0.5% of yeast extract, 1% of soluble starch, 3.5% of Jamarin S·Solid (manufactured by Jamarin Laboratory), 0.5% of Jamarin S·Liquid (manufactured by Jamarin Laboratory), 0.003% of MgSO₄, 0.001% of NaCl, 0.0001% of FeSO₄·7H₂O, 0.0001% of CoSO₄, 0.0001% of CaCl₂·7H₂O, 0.0001% of ZnSO₄, 0.1 ppm of CuSO₄·5H₂O, 0.1 ppm of KAl(SO₄)₂, 0.1 ppm of H₃BO₃, 0.1 ppm of Na₂MoO₄·2H₂O and 0.25 ppm of NiCl₂·6H₂O was placed in a 2-liter medium bottle and sterilized at 120° C. for 20 minutes. Then, nitrogen gas was blown into the medium to purge dissolved oxygen, and the above bacterial strain was inoculated into the medium, followed by stationary culture at 95° C. for 16 hours. After completion of the culture, bacterial cells were collected by centrifugation.

Then, the collected cells were suspended in 4 ml of 0.05M Tris-HCl (pH 8.0) containing 25% of sucrose, and to this suspension were added 0.8 ml of lysozyme [5 mg/ml, 0.25M Tris-HCl (pH 8.0)] and 2 ml of 0.2M EDTA. The mixture was incubated at 20° C. for 1 hour. Then, to the mixture were added 24 ml of SET solution [150 mM NaCl, 1 mM EDTA and 20 mM Tris-HCl (pH 8.0)] and further 4 ml of 5% SDS and 400 μl of Proteinase K (10 mg/ml), and the mixture was incubated at 37° C. for 1 hour. After completion of the reaction, the reaction mixture was subjected to phenol-chloroform extraction and then ethanol precipitation to prepare about 3.2 mg of genomic DNA.

Preparation of cosmid protein library

400 μg of the genomic DNA of Pyrococcus furiosus DSM3638 was partially digested with Sau3AI and size-fractionated to 35 to 50 kb by density-gradient ultracentrifugation. Then, 1 μg of a triple helix cosmid vector was digested with XbaI, dephosphorylated with alkaline phosphatase (manufactured by Takara Shuzo Co., Ltd.) and further digested with BamHI. The vector was ligated to 140 μg of the above fractionated 35 to 50 kb DNA. The genomic DNA fragments of Pyrococcus furiosus were packaged into lambda phage particles by an in vitro packaging method using Gigapack Gold (manufactured by Stratagene) to prepare a library. Then, by using a part of the library thus obtained, transduction into E. coli DH5αMCR was carried out and, among the transformants obtained, several transformants were selected to prepare cosmid DNA. After confirmation of the presence of inserted fragments having a suitable size, about 500 transformants were again selected from the above library and they were independently cultured in 150 ml of LB medium (tryptone 10 g/liter, yeast extract 5 g/liter, NaCl 5 g/liter, pH 7.2) containing 100 μg/ml of ampicillin. Each culture was centrifuged, the recovered microbial cells were suspended in 1 ml of 20 mM Tris-HCl (pH 8.0) and the suspension was subjected to heat treatment at 100° C. for 10 minutes. Then, the suspension was sonicated and again subjected to heat treatment at 100° C. for 10 minutes. The lysates obtained as supernatants after centrifugation were used as the cosmid protein library.

Selection of cosmid containing hyperthermostable protease gene

The protease activity was detected by testing for hydrolysis of gelatin in a polyacrylamide gel.

Namely, 5 μl aliquots of the lysates from the above cosmid protein library were taken and subjected to electrophoresis using a 0.1% SDS-10% polyacrylamide gel containing 0.05% of gelatin. After completion of the electrophoresis, the gel was incubated in 50 mM potassium phosphate buffer solution (pH 7.0) at 95° C. for 2 hours. The gel was stained in 2.5% Coomassie Brilliant Blue R-250, 25% ethanol and 10% acetic acid for 30 minutes. Then, the gel was transferred to 25% methanol and 7% acetic acid to decolorize for 3 to 15 hours. Eight cosmid clones having the protease activity, which showed bands not stained with Coomassie Brilliant Blue R-250 due to hydrolysis of gelatin in the gel, were selected.

Preparation of plasmid pTPR1 containing hyperthermostable protease gene

Among the 8 cosmid clones having the protease activity, one cosmid (cosmid No. 304) was selected to prepare cosmid DNA, and the cosmid DNA was digested with SphI and then ligated to the SphI site of the plasmid vector pUC119. The recombinant plasmids were transduced into E. coli JM109 and the protease activity of the resultant transformants was tested by the same method as that used for screening of the cosmid protein library. A plasmid was prepared from the transformant having the protease activity and the resultant recombinant plasmid was named pTPR1. E. coli JM109 transformed with the plasmid was named Escherichia coli JM109/pTPR1.

FIG. 1 illustrates a restriction map of the plasmid pTPR1.

Preparation of plasmid pTPR9 containing hyperthermostable protease gene

The above plasmid pTPR1 was digested with XbaI and subjected to agarose gel electrophoresis to separate three DNA fragments of about 2.5 kb, about 3.3 kb and about 4.3 kb. Among the three fragments thus separated, the two fragments of about 3.3 kb and about 4.3 kb were recovered. The DNA fragment of about 4.3 kb was dephosphorylated with alkaline phosphatase (manufactured by Takara Shuzo Co., Ltd.) and then mixed with the DNA fragment of about 3.3 kb to ligate them to each other. The ligation product was transduced into E. coli JM109. The protease activity of the resultant transformants was tested by the same method as that used for screening of the cosmid protein library. A plasmid was prepared from the transformant having the protease activity. The plasmid was named pTPR9 and E. coli JM109 transformed with the plasmid was named Escherichia coli JM109/pTPR9.

FIG. 2 illustrates a restriction map of the plasmid pTPR9.
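As a rough consistency check on the subcloning step above, the approximate plasmid sizes can be obtained by summing the XbaI fragment sizes quoted in the text. This sketch uses only those approximate figures; the function and variable names are hypothetical, and the results are estimates rather than sequence-derived sizes.

```python
def approximate_size_kb(fragments_kb):
    """Approximate size of a circular plasmid as the sum of its digest fragments."""
    return sum(fragments_kb)


# XbaI fragments of pTPR1 as quoted in the text (all values are "about").
PTPR1_XBAI_FRAGMENTS_KB = [2.5, 3.3, 4.3]
# pTPR9 was assembled from the 3.3 kb and 4.3 kb fragments only.
PTPR9_FRAGMENTS_KB = [3.3, 4.3]

if __name__ == "__main__":
    ptpr1 = approximate_size_kb(PTPR1_XBAI_FRAGMENTS_KB)
    ptpr9 = approximate_size_kb(PTPR9_FRAGMENTS_KB)
    print(f"pTPR1 is about {ptpr1:.1f} kb, pTPR9 about {ptpr9:.1f} kb")
    print(f"removed: about {ptpr1 - ptpr9:.1f} kb")
```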

Detection of DNA fragment containing whole length of hyperthermostable protease gene

The cosmid DNA used in the preparation of the above plasmid pTPR1 was digested with NotI and then further digested with BamHI, BlnI, EcoT22I, Nsp(7524)V, PvuII, SalI, SmaI and SpeI, respectively. Then, the digested DNA was subjected to electrophoresis on a 0.8% agarose gel. After electrophoresis, the gel was soaked in 0.5N NaOH containing 1.5M NaCl to denature the DNA fragments in the gel and then the gel was neutralized in 0.5M Tris-HCl (pH 7.5) containing 3M NaCl. The DNA fragments in the gel were blotted onto a Hybond-N⁺ nylon membrane (manufactured by Amersham) by Southern blotting. After blotting, the membrane was washed with 6×SSC (1×SSC represents 0.15M NaCl, 0.015M sodium citrate, pH 7.0) and air-dried, and the DNA was fixed on the membrane by UV irradiation using a UV transilluminator for 3 minutes.

On the other hand, the plasmid pTPR9 was digested with PstI and XbaI and subjected to electrophoresis on a 1% agarose gel, and the separated DNA fragment of about 0.7 kb was recovered. A ³²P-labeled DNA probe was prepared by using the DNA fragment as a template and using a random primer DNA labeling kit Ver2 (manufactured by Takara Shuzo Co., Ltd.) and [α-³²P]dCTP (manufactured by Amersham).

The above membrane to which the DNA was fixed was treated in a hybridization buffer solution (6×SSC containing 0.5% SDS, 0.1% bovine serum albumin, 0.1% polyvinyl pyrrolidone, 0.1% Ficoll 400 and 0.01% denatured salmon sperm DNA) at 68° C. for 2 hours. Then, it was transferred into a similar hybridization buffer solution containing the ³²P-labeled DNA probe and allowed to hybridize at 68° C. for 14 hours. After completion of the hybridization, the membrane was washed with 2×SSC containing 0.5% of SDS at room temperature and then with 0.1×SSC containing 0.5% of SDS at 68° C. After rinsing the membrane with 0.1×SSC, it was air-dried. An X-ray film was exposed to the membrane at -80° C. for 60 hours. The film was developed to prepare an autoradiogram. This autoradiogram showed that a protease gene was present in the DNA fragment of about 7.5 kb obtained by digestion of the cosmid DNA with NotI and PvuII.

Preparation of plasmid pTPR12 containing whole length of hyperthermostable protease gene

The cosmid DNA used for the preparation of the above plasmid pTPR1 was digested with NotI and PvuII and subjected to electrophoresis using a 0.8% agarose gel to recover the DNA fragments of about 7 to 8 kb all together. These DNA fragments were mixed with the plasmid vector pUC19 into which a NotI linker had been introduced at the HincII site and which had been digested with NotI and SmaI. Then, ligation was carried out. The recombinant plasmids were transduced into E. coli JM109 and the protease activity of the resultant transformants was tested by the same method as that used for screening of the cosmid protein library. A plasmid was prepared from the transformant having the protease activity. The plasmid was named pTPR12 and E. coli JM109 transformed with the plasmid was designated Escherichia coli JM109/pTPR12.

FIG. 3 illustrates a restriction map of the plasmid pTPR12.

Preparation of plasmid pTPR15 containing whole length of hyperthermostable protease gene

The above plasmid pTPR12 was digested with XbaI and subjected to electrophoresis using a 1% agarose gel to recover the two separated DNA fragments of about 3.3 kb and about 7 kb, respectively. Then, the DNA fragment of about 7 kb thus recovered was digested with KpnI and again subjected to electrophoresis using a 1% agarose gel to separate two fragments of about 3.2 kb and about 3.8 kb. Of these fragments, the DNA fragment of about 3.2 kb was recovered and ligated to the plasmid vector pUC19 digested with XbaI and KpnI. The ligation product was transduced into E. coli JM109. Plasmids held by the resultant transformants were prepared and the plasmid containing only one molecule of the above 3.2 kb fragment was selected. This was named pTPR14.

FIG. 5 illustrates a restriction map of the plasmid pTPR14.

Then, the above plasmid pTPR14 was digested with XbaI and dephosphorylated using alkaline phosphatase. This was mixed with the above fragment of about 3.3 kb to carry out ligation and was transduced into E. coli JM109. The protease activity of the resultant transformants was tested by the same method as that used for screening of the cosmid protein library. A plasmid was prepared from the transformant having the protease activity. This plasmid was named pTPR15 and E. coli JM109 transformed with the plasmid was named Escherichia coli JM109/pTPR15.

FIG. 6 illustrates a restriction map of the plasmid pTPR15.

EXAMPLE 2

Determination of nucleotide sequence of hyperthermostable protease gene

For determination of the nucleotide sequence of the hyperthermostable protease gene inserted into the above plasmid pTPR15, deletion mutants in which the DNA fragment portion inserted into the plasmid had been deleted to various lengths were prepared by using Kilo sequence deletion kit (manufactured by Takara Shuzo Co., Ltd.). Among them, several mutants having suitable lengths of deletion were selected and the nucleotide sequences of the respective inserted DNA fragment portions were determined by the dideoxy method using BcaBEST dideoxy sequencing kit (manufactured by Takara Shuzo Co., Ltd.). By putting these results together, the nucleotide sequence of the inserted DNA fragment contained in the plasmid pTPR15 was determined. Of the nucleotide sequence thus obtained, SEQ ID NO 8 of the Sequence Listing shows the fragment of 4765 bp between two DraI sites. Furthermore, SEQ ID NO 9 shows the amino acid sequence of the hyperthermostable protease encoded by the open reading frame contained in the above nucleotide sequence.

Preparation of plasmid pTPR13 containing hyperthermostable protease gene

The above plasmid pTPR15 was digested with DraI and subjected to 1% agarose gel electrophoresis, followed by recovering the separated DNA fragment of about 4.8 kb. Then, the plasmid vector pUC19 was digested with SmaI and, after dephosphorylation with alkaline phosphatase, it was mixed with the above DNA fragment of about 4.8 kb to carry out ligation and transduced into E. coli JM109. The protease activity of the resultant transformants was tested by the same method as that used for screening of the cosmid protein library. A plasmid was prepared from a transformant having the activity. The plasmid was named pTPR13 and E. coli JM109 transformed with the plasmid was named Escherichia coli JM109/pTPR13.

FIG. 7 illustrates a restriction map of the plasmid pTPR13.

Preparation of plasmid pUBP13 containing hyperthermostable protease gene for transforming Bacillus subtilis

The above plasmid pTPR13 was digested with KpnI and BamHI and then subjected to 1% agarose gel electrophoresis, followed by recovering the separated DNA fragment of about 4.8 kb. Then, the plasmid vector pUB18-P43 was digested with KpnI and BamHI and mixed with the above DNA fragment of about 4.8 kb to carry out ligation. The ligation product was transduced into Bacillus subtilis DB104. The protease activity of the resultant transformants having kanamycin resistance was tested by the same method as that used for screening of the cosmid protein library. A plasmid was prepared from a transformant having the activity. The plasmid was named pUBP13 and Bacillus subtilis DB104 transformed with the plasmid was named Bacillus subtilis DB104/pUBP13.

FIG. 8 illustrates a restriction map of the plasmid pUBP13.

Preparation of plasmid pTPR36 containing hyperthermostable protease gene lacking its latter half

The above plasmid pTPR13 was digested with EcoRI and the resultant ends were blunted with a DNA blunting kit (manufactured by Takara Shuzo Co., Ltd.). Further, it was digested with KpnI and subjected to 1% agarose gel electrophoresis, followed by recovering the separated DNA fragment of about 2.8 kb. Next, the plasmid vector pUC119 was digested with XbaI, the resultant ends were blunted, and it was further digested with KpnI, followed by mixing with the above DNA fragment of about 2.8 kb to carry out ligation and transducing into E. coli JM109.

The protease activity of the resultant transformants was tested by the same method as that used for screening of the cosmid protein library. A plasmid was prepared from a transformant having the activity. The plasmid was named pTPR36 and E. coli JM109 transformed with the plasmid was named Escherichia coli JM109/pTPR36.

FIG. 9 illustrates a restriction map of the plasmid pTPR36. SEQ ID NO 2 of the Sequence Listing shows the nucleotide sequence of the DNA fragment inserted into the plasmid pTPR36. Also, SEQ ID NO 1 shows an amino acid sequence of the hyperthermostable protease which can be encoded by the nucleotide sequence.

EXAMPLE 3

Preparation of oligonucleotide for detection of hyperthermostable protease gene

By comparing the deduced amino acid sequence of the hyperthermostable protease of the present invention obtained in Example 2 with the amino acid sequences of known alkaline serine proteases originating in microorganisms, it was found that there were homologous amino acid sequences commonly present in these enzymes. Among them, three regions were selected and oligonucleotides to be used as primers in detection of hyperthermostable protease genes by PCR were designed.

FIGS. 10, 11 and 12 illustrate the relation among the amino acid sequences corresponding to the above three regions of the hyperthermostable protease of the present invention, the nucleotide sequences of the hyperthermostable protease gene of the present invention which encode the above regions, and the nucleotide sequences of the oligonucleotides PRO-1F, PRO-2F, PRO-2R and PRO-4R synthesized based on the above nucleotide sequences. Also, SEQ ID NO 3, 4, 5 and 6 show the nucleotide sequences of PRO-1F, PRO-2F, PRO-2R and PRO-4R, respectively.

Preparation of genomic DNA of Thermococcus celer

Microbial cells were collected by centrifugation from 10 ml of a culture broth of Thermococcus celer DSM2476 obtained from Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH and suspended in 100 μl of 50 mM Tris-HCl (pH 8.0) containing 25% sucrose. To the suspension were added 20 μl of 0.5M EDTA and 10 μl of lysozyme (10 mg/ml) and the suspension was incubated at 20° C. for 1 hour. To this were added 800 μl of SET solution (150 mM NaCl, 1 mM EDTA, 20 mM Tris-HCl, pH 8.0), 50 μl of 10% SDS and 10 μl of Proteinase K (20 mg/ml) and the suspension was further incubated at 37° C. for 1 hour. Chloroform-phenol extraction was carried out to stop the reaction. The reaction mixture was subjected to ethanol precipitation and the recovered DNA was dissolved in 50 μl of TE buffer solution to obtain a genomic DNA solution.

Detection of hyperthermostable protease by PCR

A PCR reaction mixture was prepared from the above genomic DNA of Thermococcus celer and the oligonucleotides PRO-1F and PRO-2R or the oligonucleotides PRO-2F and PRO-4R and a PCR reaction (35 cycles, each cycle: 94° C. for 1 minute, 55° C. for 1 minute and 72° C. for 1 minute) was carried out. When aliquots of the reaction mixture were subjected to agarose gel-electrophoresis, amplification of three DNA fragments was observed with the oligonucleotides PRO-1F and PRO-2R and amplification of one DNA fragment was observed with the oligonucleotides PRO-2F and PRO-4R. These amplified fragments were recovered from the agarose gel and their DNA ends were blunted by a DNA blunting kit, followed by phosphorylation with T4 polynucleotide kinase (manufactured by Takara Shuzo Co., Ltd.). Then, the plasmid vector pUC18 was digested with HincII and subjected to dephosphorylation with alkaline phosphatase. It was mixed with the above PCR amplified DNA fragments to carry out ligation and then transduced into E. coli JM109. Plasmids were prepared from the resultant transformants and plasmids into which suitable DNA fragments were inserted were selected. Nucleotide sequences of the inserted DNA fragments were determined by the dideoxy method. Among these plasmids, regarding a plasmid p1F-2R(2) containing a DNA fragment of about 150 bp which was amplified by using the oligonucleotides PRO-1F and PRO-2R and a plasmid p2F-4R containing a DNA fragment of about 550 bp which was amplified by using the oligonucleotides PRO-2F and PRO-4R, it was found that amino acid sequences estimated from the thus-obtained nucleotide sequences contained sequences having homology with the amino acid sequences of the hyperthermostable protease originating in Pyrococcus furiosus of the present invention, subtilisin and the like.
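The two primer pairs share the PRO-2 region: PRO-2F and PRO-2R, as given in SEQ ID NO 4 and 5, anneal to opposite strands of the same conserved stretch. The following sketch, which is our own check and not part of the described procedure, computes the reverse complement of a degenerate sequence and compares the two oligonucleotides.

```python
# Complement table covering the IUPAC ambiguity codes
# (for example R = A/G pairs with Y = C/T, and D = A/G/T pairs with H = A/C/T).
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence that may contain IUPAC codes."""
    return seq.translate(COMPLEMENT)[::-1]


PRO_2F = "KSTCACGGAACTCACGTDGCBGGMACDGTTGC"   # SEQ ID NO 4
PRO_2R = "ASCMGCAACHGTKCCVGCHACGTGAGTTCCGTG"  # SEQ ID NO 5

rc = reverse_complement(PRO_2R)
print(rc)
# The 29 nucleotides of PRO-2F after its first three positions
# reappear at the start of the reverse complement of PRO-2R.
print(rc.startswith(PRO_2F[3:]))
```

Because of this overlap, the fragment amplified with PRO-1F and PRO-2R and the fragment amplified with PRO-2F and PRO-4R are expected to adjoin each other on the gene.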

SEQ ID NO 10 of the Sequence Listing shows the nucleotide sequence of the DNA fragment inserted into the plasmid p1F-2R(2) and an amino acid sequence deduced from the nucleotide sequence. Also, SEQ ID NO 11 of the Sequence Listing shows the nucleotide sequence of the DNA fragment inserted into the plasmid p2F-4R and an amino acid sequence deduced from the nucleotide sequence. In the nucleotide sequence shown by SEQ ID NO 10 of the Sequence Listing, the sequence from the first to the 21st nucleotides and that from the 113th to the 145th nucleotides and, in SEQ ID NO 11 of the Sequence Listing, the sequence from the first to the 32nd nucleotides and that from the 532nd to the 564th nucleotides are the sequences of the primers used in the PCR (corresponding to the oligonucleotides PRO-1F, PRO-2R, PRO-2F and PRO-4R, respectively).
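Only the nucleotides between the primer-derived ends reflect the Thermococcus celer template itself. As a minimal sketch, the internal span can be computed from the 1-based inclusive coordinates quoted above; the helper names are illustrative.

```python
def internal_span(fwd_end: int, rev_start: int) -> tuple[int, int, int]:
    """Return (start, end, length) of the region between two primer sites.

    Coordinates are 1-based and inclusive, as in the Sequence Listing.
    """
    start, end = fwd_end + 1, rev_start - 1
    return start, end, end - start + 1


def trim_primers(seq: str, fwd_end: int, rev_start: int) -> str:
    """Slice out the template-derived part of an amplified sequence."""
    return seq[fwd_end:rev_start - 1]


print(internal_span(21, 113))   # insert of p1F-2R(2), SEQ ID NO 10
print(internal_span(32, 532))   # insert of p2F-4R, SEQ ID NO 11
```

With these coordinates the template-derived regions are 91 and 499 nucleotides long, respectively.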

FIG. 13 illustrates a restriction map of the plasmid p2F-4R.

Screening of protease gene originating in Thermococcus celer

The above genomic DNA of Thermococcus celer was partially digested with Sau3AI and was treated with Klenow fragment (manufactured by Takara Shuzo Co., Ltd.) in the presence of dATP and dGTP to partially repair the DNA ends. The DNA fragments were mixed with a lambda GEM-11 XhoI half site arm vector (manufactured by Promega) to carry out ligation. Then, they were subjected to in vitro packaging using Gigapack Gold to prepare a lambda phage library containing genomic DNA of Thermococcus celer. A part of the library was transduced into E. coli LE392 to form plaques on a plate and the plaques were transferred onto a Hybond-N⁺ membrane. After transfer, the membrane was treated with 0.5 N NaOH containing 1.5M NaCl and then 0.5M Tris-HCl (pH 7.5) containing 3M NaCl. Further, it was washed with 6×SSC, air-dried and irradiated with UV light on a UV transilluminator to fix phage DNA on the membrane.

On the other hand, the plasmid p2F-4R was digested with PmaCI (manufactured by Takara Shuzo Co., Ltd.) and StuI (manufactured by Takara Shuzo Co., Ltd.) and subjected to 1% agarose gel-electrophoresis to recover the separated DNA fragment of about 0.5 kb. By using this fragment as a template and using a random primer DNA labeling kit Ver2 and [α-³²P]dCTP, a ³²P-labeled DNA probe was prepared.

The above membrane having DNA fixed thereon was treated in a hybridization buffer solution (6×SSC containing 0.5% SDS, 0.1% bovine serum albumin, 0.1% polyvinyl pyrrolidone, 0.1% Ficoll 400 and 0.01% denatured salmon sperm DNA) at 50° C. for 2 hours. It was transferred to the same buffer solution containing the ³²P-labeled DNA probe and hybridization was carried out at 50° C. for 15 hours. After completion of hybridization, the membrane was washed with 2×SSC containing 0.5% SDS at room temperature and then 1×SSC containing 0.5% SDS at 50° C. Further, after rinsing the membrane with 1×SSC, it was air-dried and an X-ray film was exposed thereto at -80° C. for 6 hours to prepare an autoradiogram. About 4,000 phage clones were screened. As a result, one phage clone containing a protease gene was obtained. Based on the signal on the autoradiogram, the position of this phage clone was found and the corresponding plaque on the plate used for transfer to the membrane was isolated into 1 ml of SM buffer solution [50 mM Tris-HCl, 0.1M NaCl, 8 mM MgSO₄, 0.01% gelatin (pH 7.5)] containing 1% of chloroform.
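The washing stringency in this procedure is set by the SSC strength. As a minimal sketch, the salt content of each wash can be worked out assuming the conventional definition of 20×SSC as 3 M NaCl with 0.3 M sodium citrate; that definition is a standard laboratory convention and is not stated in this specification.

```python
NACL_MM_PER_X = 150.0     # mM NaCl in 1x SSC (assumed: 20x = 3 M NaCl)
CITRATE_MM_PER_X = 15.0   # mM sodium citrate in 1x SSC (assumed: 20x = 0.3 M)


def ssc_composition(strength: float) -> tuple[float, float]:
    """Return (NaCl mM, sodium citrate mM) for a given SSC strength."""
    return NACL_MM_PER_X * strength, CITRATE_MM_PER_X * strength


def stock_volume(strength: float, final_ml: float, stock_strength: float = 20.0) -> float:
    """Volume of concentrated SSC stock (ml) needed for final_ml of diluted SSC."""
    return final_ml * strength / stock_strength


for x in (6, 2, 1):
    nacl, citrate = ssc_composition(x)
    print(f"{x}x SSC: {nacl:.0f} mM NaCl, {citrate:.0f} mM citrate, "
          f"{stock_volume(x, 500):.0f} ml of 20x stock per 500 ml")
```

Lower SSC strength together with higher temperature gives a more stringent wash, which is why the sequence runs from 2×SSC at room temperature to 1×SSC at 50° C.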

Detection of phage DNA fragment containing protease gene

The above phage clone was transduced into E. coli LE392 and the transformant was cultured in NZCYM medium (manufactured by Bio 101) at 37° C. for 15 hours to obtain a culture broth. A supernatant of the culture broth was collected and phage DNA was prepared by using QIAGEN-lambda kit (manufactured by DIAGEN). The resultant phage DNA was digested with BamHI, EcoRI, EcoRV, HincII, KpnI, NcoI, PstI, SacI, SalI, SmaI and SphI (all manufactured by Takara Shuzo Co., Ltd.), respectively, and subjected to 1% agarose gel-electrophoresis. Then, a membrane on which DNA fragments were fixed was prepared by the same method as that used for the detection of the DNA fragment containing the whole length of the hyperthermostable protease gene of Example 1. The membrane was treated in a hybridization buffer solution at 50° C. for 4 hours and then transferred to the same hybridization buffer solution containing the same ³²P-labeled DNA probe as that used in the above screening of the protease gene derived from Thermococcus celer. Then, hybridization was carried out at 50° C. for 18 hours. After completion of hybridization, the membrane was washed with 1×SSC containing 0.5% SDS at 50° C. and rinsed with 1×SSC. The membrane was air-dried and exposed to an X-ray film at -80° C. for 2 hours to prepare an autoradiogram. According to this autoradiogram, it was found that, in the phage DNA digested with KpnI, the protease gene was contained in a DNA fragment of about 9 kb.

Preparation of plasmid pTC1 containing protease gene

The above phage DNA containing the protease gene was digested with KpnI and subjected to 1% agarose gel-electrophoresis to recover a DNA fragment of about 9 kb from the gel. Then, the plasmid vector pUC119 was digested with KpnI and dephosphorylated with alkaline phosphatase, followed by mixing with the above DNA fragment of about 9 kb to carry out ligation. Then, it was transduced into E. coli JM109. Plasmids were prepared from the resultant transformants and a plasmid containing only the above DNA fragment of about 9 kb was selected. This plasmid was named pTC1 and E. coli JM109 transformed with the plasmid was named Escherichia coli JM109/pTC1.

FIG. 14 illustrates a restriction map of the plasmid pTC1.

Preparation of plasmid pTC3 containing hyperthermostable protease gene

The above plasmid pTC1 was digested with KpnI and further digested with BamHI, PstI and SphI, respectively. After 1% agarose gel-electrophoresis, transfer of DNA fragments to a membrane and detection of DNA fragments containing the hyperthermostable protease gene were carried out by the same operation as that for detecting the phage DNA fragment containing the above protease gene. The signal on the resultant autoradiogram showed that a DNA fragment of about 5 kb which was obtained by digesting the plasmid pTC1 with KpnI and BamHI contained the hyperthermostable protease gene.

Then, the plasmid pTC1 was digested with KpnI and BamHI and then subjected to 1% agarose gel-electrophoresis to separate and isolate a DNA fragment of about 5 kb. The plasmid vector pUC119 was digested with KpnI and BamHI and mixed with the above DNA fragment of about 5 kb to carry out ligation. It was transduced into E. coli JM109. Plasmids were prepared from the resultant transformants and a plasmid containing the above DNA fragment of about 5 kb was selected. This plasmid was named pTC3 and E. coli JM109 transformed with the plasmid was named Escherichia coli JM109/pTC3.

FIG. 15 illustrates a restriction map of the plasmid pTC3.

Determination of nucleotide sequence of hyperthermostable protease gene contained in plasmid pTC3

For determination of the nucleotide sequence of the hyperthermostable protease gene contained in the above plasmid pTC3, 6 oligonucleotides were synthesized based on the nucleotide sequences shown by SEQ ID NO 10 and 11 of the Sequence Listing. The nucleotide sequences of the synthesized oligonucleotides TCE-2, TCE-4, SEF-3, SER-1, SER-3 and TCE-6R are shown by SEQ ID NO 12, 13, 14, 15, 16 and 17 of the Sequence Listing, respectively. The results obtained by the dideoxy method using the above oligonucleotides as primers and the plasmid pTC3 as a template were combined to determine the nucleotide sequence of the hyperthermostable protease gene.

SEQ ID NO 7 of the Sequence Listing shows a part of the resultant nucleotide sequence. In addition, SEQ ID NO 18 of the Sequence Listing shows a deduced amino acid sequence encoded by the nucleotide sequence.

EXAMPLE 4

Preparation of enzyme sample

Escherichia coli JM109/pTPR36, which was E. coli JM109 into which the plasmid pTPR36 containing the hyperthermostable protease gene of the present invention obtained in Example 2 was transduced, was cultured with shaking in 5 ml of LB medium (tryptone 10 g/liter, yeast extract 5 g/liter, NaCl 5 g/liter, pH 7.2) containing 100 μg/ml of ampicillin at 37° C. for 14 hours. In a 1-liter Erlenmeyer flask, 200 ml of the same medium was prepared and 2 ml of the above culture broth was inoculated and cultured with shaking at 37° C. for 10 hours. The culture broth was centrifuged. The harvested microbial cells (wet weight 1.6 g) were suspended in 2 ml of 20 mM Tris-HCl (pH 8.0), sonicated and centrifuged to obtain a supernatant. The supernatant was treated at 100° C. for 5 minutes and centrifuged again. The resultant supernatant was used as a crude enzyme solution (enzyme sample PF-36).

In addition, in the same manner, Escherichia coli JM109/pTPR13, which was E. coli JM109 into which the plasmid pTPR13 containing the hyperthermostable protease gene of the present invention was transduced, was used to prepare a crude enzyme solution (enzyme sample PF-13).

Moreover, Bacillus subtilis DB104/pUBP13, which was Bacillus subtilis DB104 into which the plasmid pUBP13 containing the hyperthermostable protease gene of the present invention was transduced, was cultured with shaking in 5 ml of LB medium containing 10 μg/ml of kanamycin at 37° C. for 14 hours. In each of two 2-liter Erlenmeyer flasks, 600 ml of the same medium was prepared. Each flask was inoculated with 2 ml of the above culture broth and cultured with shaking at 37° C. for 26 hours. The culture broth was centrifuged. The resultant microbial cells were suspended in 15 ml of 20 mM Tris-HCl (pH 8.0), sonicated and centrifuged to obtain a supernatant. The supernatant was treated at 100° C. for 5 minutes and centrifuged again. To the resultant supernatant was added ammonium sulfate to 50% saturation and then the resultant precipitate was recovered by centrifugation. The recovered precipitate was suspended in 2 ml of 20 mM Tris-HCl (pH 8.0) and the suspension was dialyzed against the same buffer solution. The resultant inner solution was used as a partially purified enzyme sample (enzyme sample PF-BS13).

The protease activity of these enzyme samples and the cosmid clone lysate used for preparation of plasmids were tested according to the above method for detection of enzyme activity using SDS-polyacrylamide gel containing gelatin.

FIG. 16 illustrates the thermostability of the hyperthermostable protease obtained by the present invention. FIG. 17 illustrates the optimum pH of the hyperthermostable protease obtained by the present invention. Further, FIG. 18 illustrates the results of activity staining after SDS-polyacrylamide gel electrophoresis of each sample (enzyme samples PF-36, PF-13 and PF-BS13 and the lysate). Each sample shows activity at 95° C. in the presence of SDS.

As described hereinabove, according to the present invention, genes encoding hyperthermostable proteases which show activity at 95° C. were obtained. These genes make it possible to supply a large amount of a hyperthermostable protease having high purity.

    __________________________________________________________________________    SEQUENCE LISTING                                                              (1) GENERAL INFORMATION:                                                      (iii) NUMBER OF SEQUENCES: 18                                                 (2) INFORMATION FOR SEQ ID NO:1:                                              (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 903 amino acids                                                   (B) TYPE: amino acid                                                          (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: peptide                                                   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:                                       MetAsnLysLysGlyLeuThrValLeuPheIleAlaIleMetLeuLeu                              151015                                                                        SerValValProValHisPheValSerAlaGluThrProProValSer                              202530                                                                        SerGluAsnSerThrThrSerIleLeuProAsnGlnGlnValValThr                              354045                                                                        LysGluValSerGlnAlaAlaLeuAsnAlaIleMetLysGlyGlnPro                              505560                                                                        AsnMetValLeuIleIleLysThrLysGluGlyLysLeuGluGluAla                              65707580                                                                      LysThrGluLeuGluLysLeuGlyAlaGluIleLeuAspGluAsnArg                              859095                                                                        ValLeuAsnMetLeuLeuValLysIleLysProGluLysValLysGlu                              100105110                                     
                                LeuAsnTyrIleSerSerLeuGluLysAlaTrpLeuAsnArgGluVal                              115120125                                                                     LysLeuSerProProIleValGluLysAspValLysThrLysGluPro                              130135140                                                                     SerLeuGluProLysMetTyrAsnSerThrTrpValIleAsnAlaLeu                              145150155160                                                                  GlnPheIleGlnGluPheGlyTyrAspGlySerGlyValValValAla                              165170175                                                                     ValLeuAspThrGlyValAspProAsnHisProPheLeuSerIleThr                              180185190                                                                     ProAspGlyArgArgLysIleIleGluTrpLysAspPheThrAspGlu                              195200205                                                                     GlyPheValAspThrSerPheSerPheSerLysValValAsnGlyThr                              210215220                                                                     LeuIleIleAsnThrThrPheGlnValAlaSerGlyLeuThrLeuAsn                              225230235240                                                                  GluSerThrGlyLeuMetGluTyrValValLysThrValTyrValSer                              245250255                                                                     AsnValThrIleGlyAsnIleThrSerAlaAsnGlyIleTyrHisPhe                              260265270                                                                     GlyLeuLeuProGluArgTyrPheAspLeuAsnPheAspGlyAspGln                              275280285                                                                     GluAspPheTyrProValLeuLeuValAsnSerThrGlyAsnGlyTyr                              290295300                                                                     AspIleAlaTyrValAspThrAspLeuAspTyrAspPheThrAspGlu                              305310315320      
                                                            ValProLeuGlyGlnTyrAsnValThrTyrAspValAlaValPheSer                              325330335                                                                     TyrTyrTyrGlyProLeuAsnTyrValLeuAlaGluIleAspProAsn                              340345350                                                                     GlyGluTyrAlaValPheGlyTrpAspGlyHisGlyHisGlyThrHis                              355360365                                                                     ValAlaGlyThrValAlaGlyTyrAspSerAsnAsnAspAlaTrpAsp                              370375380                                                                     TrpLeuSerMetTyrSerGlyGluTrpGluValPheSerArgLeuTyr                              385390395400                                                                  GlyTrpAspTyrThrAsnValThrThrAspThrValGlnGlyValAla                              405410415                                                                     ProGlyAlaGlnIleMetAlaIleArgValLeuArgSerAspGlyArg                              420425430                                                                     GlySerMetTrpAspIleIleGluGlyMetThrTyrAlaAlaThrHis                              435440445                                                                     GlyAlaAspValIleSerMetSerLeuGlyGlyAsnAlaProTyrLeu                              450455460                                                                     AspGlyThrAspProGluSerValAlaValAspGluLeuThrGluLys                              465470475480                                                                  TyrGlyValValPheValIleAlaAlaGlyAsnGluGlyProGlyIle                              485490495                                                                     AsnIleValGlySerProGlyValAlaThrLysAlaIleThrValGly                              500505510                                                                     AlaAlaAlaValProIleAsnValGlyValTyrValSerGlnAlaLeu                    
          515520525                                                                     GlyTyrProAspTyrTyrGlyPheTyrTyrPheProAlaTyrThrAsn                              530535540                                                                     ValArgIleAlaPhePheSerSerArgGlyProArgIleAspGlyGlu                              545550555560                                                                  IleLysProAsnValValAlaProGlyTyrGlyIleTyrSerSerLeu                              565570575                                                                     ProMetTrpIleGlyGlyAlaAspPheMetSerGlyThrSerMetAla                              580585590                                                                     ThrProHisValSerGlyValValAlaLeuLeuIleSerGlyAlaLys                              595600605                                                                     AlaGluGlyIleTyrTyrAsnProAspIleIleLysLysValLeuGlu                              610615620                                                                     SerGlyAlaThrTrpLeuGluGlyAspProTyrThrGlyGlnLysTyr                              625630635640                                                                  ThrGluLeuAspGlnGlyHisGlyLeuValAsnValThrLysSerTrp                              645650655                                                                     GluIleLeuLysAlaIleAsnGlyThrThrLeuProIleValAspHis                              660665670                                                                     TrpAlaAspLysSerTyrSerAspPheAlaGluTyrLeuGlyValAsp                              675680685                                                                     ValIleArgGlyLeuTyrAlaArgAsnSerIleProAspIleValGlu                              690695700                                                                     TrpHisIleLysTyrValGlyAspThrGluTyrArgThrPheGluIle                              705710715720                                                                  
TyrAlaThrGluProTrpIleLysProPheValSerGlySerValIle                              725730735                                                                     LeuGluAsnAsnThrGluPheValLeuArgValLysTyrAspValGlu                              740745750                                                                     GlyLeuGluProGlyLeuTyrValGlyArgIleIleIleAspAspPro                              755760765                                                                     ThrThrProValIleGluAspGluIleLeuAsnThrIleValIlePro                              770775780                                                                     GluLysPheThrProGluAsnAsnTyrThrLeuThrTrpTyrAspIle                              785790795800                                                                  AsnGlyProGluMetValThrHisHisPhePheThrValProGluGly                              805810815                                                                     ValAspValLeuTyrAlaMetThrThrTyrTrpAspTyrGlyLeuTyr                              820825830                                                                     ArgProAspGlyMetPheValPheProTyrGlnLeuAspTyrLeuPro                              835840845                                                                     AlaAlaValSerAsnProMetProGlyAsnTrpGluLeuValTrpThr                              850855860                                                                     GlyPheAsnPheAlaProLeuTyrGluSerGlyPheLeuValArgIle                              865870875880                                                                  TyrGlyValGluIleThrProSerValTrpTyrIleAsnArgThrTyr                              885890895                                                                     LeuAspThrAsnThrGluPhe                                                         900                                                                           (2) INFORMATION FOR SEQ ID NO:2:                                              (i) SEQUENCE CHARACTERISTICS:                     
                            (A) LENGTH: 2835 base pairs                                                   (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: cDNA                                                      (xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:                                       TTTAAATTATAAGATATAATCACTCCGAGTGATGAGTAAGATACATCATTACAGTCCCAA60                AATGTTTATAATTGGAACGCAGTGAATATACAAAATGAATATAACCTCGGAGGTGACTGT120               AGAATGAATAAGAAGGGACTTACTGTGCTATTTATAGCGATAATGCTCCTTTCAGTAGTT180               CCAGTGCACTTTGTGTCCGCAGAAACACCACCGGTTAGTTCAGAAAATTCAACAACTTCT240               ATACTCCCTAACCAACAAGTTGTGACAAAAGAAGTTTCACAAGCGGCGCTTAATGCTATA300               ATGAAAGGACAACCCAACATGGTTCTTATAATCAAGACTAAGGAAGGCAAACTTGAAGAG360               GCAAAAACCGAGCTTGAAAAGCTAGGTGCAGAGATTCTTGACGAAAATAGAGTTCTTAAC420               ATGTTGCTAGTTAAGATTAAGCCTGAGAAAGTTAAAGAGCTCAACTATATCTCATCTCTT480               GAAAAAGCCTGGCTTAACAGAGAAGTTAAGCTTTCCCCTCCAATTGTCGAAAAGGACGTC540               AAGACTAAGGAGCCCTCCCTAGAACCAAAAATGTATAACAGCACCTGGGTAATTAATGCT600               CTCCAGTTCATCCAGGAATTTGGATATGATGGTAGTGGTGTTGTTGTTGCAGTACTTGAC660               ACGGGAGTTGATCCGAACCATCCTTTCTTGAGCATAACTCCAGATGGACGCAGGAAAATT720               ATAGAATGGAAGGATTTTACAGACGAGGGATTCGTGGATACATCATTCAGCTTTAGCAAG780               GTTGTAAATGGGACTCTTATAATTAACACAACATTCCAAGTGGCCTCAGGTCTCACGCTG840               AATGAATCGACAGGACTTATGGAATACGTTGTTAAGACTGTTTACGTGAGCAATGTGACC900               ATTGGAAATATCACTTCTGCTAATGGCATCTATCACTTCGGCCTGCTCCCAGAAAGATAC960               TTCGACTTAAACTTCGATGGTGATCAAGAGGACTTCTATCCTGTCTTATTAGTTAACTCC1020              ACTGGCAATGGTTATGACATTGCATATGTGGATACTGACCTTGACTACGACTTCACCGAC1080              GAAGTTCCACTTGGCCAGTACAACGTTACTTATGATGTTGCTGTTTTTAGCTACTACTAC1140              
GGTCCTCTCAACTACGTGCTTGCAGAAATAGATCCTAACGGAGAATATGCAGTATTTGGG1200              TGGGATGGTCACGGTCACGGAACTCACGTAGCTGGAACTGTTGCTGGTTACGACAGCAAC1260              AATGATGCTTGGGATTGGCTCAGTATGTACTCTGGTGAATGGGAAGTGTTCTCAAGACTC1320              TATGGTTGGGATTATACGAACGTTACCACAGACACCGTGCAGGGTGTTGCTCCAGGTGCC1380              CAAATAATGGCAATAAGAGTTCTTAGGAGTGATGGACGGGGTAGCATGTGGGATATTATA1440              GAAGGTATGACATACGCAGCAACCCATGGTGCAGACGTTATAAGCATGAGTCTCGGTGGA1500              AATGCTCCATACTTAGATGGTACTGATCCAGAAAGCGTTGCTGTGGATGAGCTTACCGAA1560              AAGTACGGTGTTGTATTCGTAATAGCTGCAGGAAATGAAGGTCCTGGCATTAACATCGTT1620              GGAAGTCCTGGTGTTGCAACAAAGGCAATAACTGTTGGAGCTGCTGCAGTGCCCATTAAC1680              GTTGGAGTTTATGTTTCCCAAGCACTTGGATATCCTGATTACTATGGATTCTATTACTTC1740              CCCGCCTACACAAACGTTAGAATAGCATTCTTCTCAAGCAGAGGGCCGAGAATAGATGGT1800              GAAATAAAACCCAATGTAGTGGCTCCAGGTTACGGAATTTACTCATCCCTGCCGATGTGG1860              ATTGGCGGAGCTGACTTCATGTCTGGAACTTCGATGGCTACTCCACATGTCAGCGGTGTC1920              GTTGCACTCCTCATAAGCGGGGCAAAGGCCGAGGGAATATACTACAATCCAGATATAATT1980              AAGAAGGTTCTTGAGAGCGGTGCAACCTGGCTTGAGGGAGATCCATATACTGGGCAGAAG2040              TACACTGAGCTTGACCAAGGTCATGGTCTTGTTAACGTTACCAAGTCCTGGGAAATCCTT2100              AAGGCTATAAACGGCACCACTCTCCCAATTGTTGATCACTGGGCAGACAAGTCCTACAGC2160              GACTTTGCGGAGTACTTGGGTGTGGACGTTATAAGAGGTCTCTACGCAAGGAACTCTATA2220              CCTGACATTGTCGAGTGGCACATTAAGTACGTAGGGGACACGGAGTACAGAACTTTTGAG2280              ATCTATGCAACTGAGCCATGGATTAAGCCTTTTGTCAGTGGAAGTGTAATTCTAGAGAAC2340              AATACCGAGTTTGTCCTTAGGGTGAAATATGATGTAGAGGGTCTTGAGCCAGGTCTCTAT2400              GTTGGAAGGATAATCATTGATGATCCAACAACGCCAGTTATTGAAGACGAGATCTTGAAC2460              ACAATTGTTATTCCCGAGAAGTTCACTCCTGAGAACAATTACACCCTCACCTGGTATGAT2520              ATTAATGGTCCAGAAATGGTGACTCACCACTTCTTCACTGTGCCTGAGGGAGTGGACGTT2580              CTCTACGCGATGACCACATACTGGGACTACGGTCTGTACAGACCAGATGGAATGTTTGTG2640              
TTCCCATACCAGCTAGATTATCTTCCCGCTGCAGTCTCAAATCCAATGCCTGGAAACTGG2700              GAGCTAGTATGGACTGGATTTAACTTTGCACCCCTCTATGAGTCGGGCTTCCTTGTAAGG2760              ATTTACGGAGTAGAGATAACTCCAAGCGTTTGGTACATTAACAGGACATACCTTGACACT2820              AACACTGAATTCTAG2835                                                           (2) INFORMATION FOR SEQ ID NO:3:                                              (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 35 base pairs                                                     (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: cDNA                                                      (xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:                                       GGWWSDRRTGTTRRHGTHGCDGTDMTYGACACSGG35                                         (2) INFORMATION FOR SEQ ID NO:4:                                              (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 32 base pairs                                                     (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: cDNA                                                      (xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:                                       KSTCACGGAACTCACGTDGCBGGMACDGTTGC32                                            (2) INFORMATION FOR SEQ ID NO:5:                                              (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 33 base pairs                                                     (B) TYPE: nucleic acid                            
                            (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: cDNA                                                      (xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:                                       ASCMGCAACHGTKCCVGCHACGTGAGTTCCGTG33                                           (2) INFORMATION FOR SEQ ID NO:6:                                              (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 34 base pairs                                                     (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: cDNA                                                      (xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:                                       CHCCGSYVACRTGBGGAGWDGCCATBGAVGTDCC34                                          (2) INFORMATION FOR SEQ ID NO:7:                                              (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 898 base pairs                                                    (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: cDNA                                                      (xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:                                       GATCTGAAGGGCAAGGTCATAGGCTGGTACGACGCCGTCAACGGCAGGTCGACCCCCTAC60                GATGACCAGGGACACGGAACCCACGTTGCGGGTATCGTTGCCGGAACCGGCAGCGTTAAC120               TCCCAGTACATAGGCGTCGCCCCCGGCGCGAAGCTCGTCGGCGTCAAGGTTCTCGGTGCC180               
GACGGTTCGGGAAGCGTCTCCACCATCATCGCGGGTGTTGACTGGGTCGTCCAGAACAAG240               GACAAGTACGGGATAAGGGTCATCAACCTCTCCCTCGGCTCCTCCCAGAGCTCCGACGGA300               ACCGACTCCCTCAGTCAGGCCGTCAACAACGCCTGGGACGCCGGTATAGTAGTCTGCGTC360               GCCGCCGGCAACAGCGGGCCGAACACCTACACCGTCGGCTCACCCGCCGCCGCGAGCAAG420               GTCATAACCGTCGGTGCAGTTGACAGCAACGACAACATCGCCAGCTTCTCCAGCAGGGGA480               CCGACCGCGGACGGAAGGCTCAAGCCGGAAGTCGTCGCCCCCGGCGTTGACATCATAGCC540               CCGCGCGCCAGCGGAACCAGCATGGGCACCCCGATAAACGACTACTNCAACAAGGGCTCT600               GGATCCAGCATGGACACCCCGCACGTTTCGGGCGTTGGCGGGCTCATCCTCCAGGCCCAC660               CCGAGCTGGACCCCGGACAAGGTGAAGACGCCCTCATCGAGACCGCCGACATAGTCGNCC720               CCAAGGAGATAGCGGACATCGCCTACGGTGCGGGTAGGGTGAACGTCTTCAAGGGCATCA780               AGTNCGACGACTACGNCAAGNTCACCTTCACCGGNTCCGTCGGCGACAAGGGAAGGGGCA840               CCACACCTTCGACGTCAGNGGGGGCACTTCGTGAACGNCACCCTCTNCTNGGACANGG898                 (2) INFORMATION FOR SEQ ID NO:8:                                              (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 4765 base pairs                                                   (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: cDNA                                                      (xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:                                       TTTAAATTATAAGATATAATCACTCCGAGTGATGAGTAAGATACATCATTACAGTCCCAA60                AATGTTTATAATTGGAACGCAGTGAATATACAAAATGAATATAACCTCGGAGGTGACTGT120               AGAATGAATAAGAAGGGACTTACTGTGCTATTTATAGCGATAATGCTCCTTTCAGTAGTT180               CCAGTGCACTTTGTGTCCGCAGAAACACCACCGGTTAGTTCAGAAAATTCAACAACTTCT240               ATACTCCCTAACCAACAAGTTGTGACAAAAGAAGTTTCACAAGCGGCGCTTAATGCTATA300               
ATGAAAGGACAACCCAACATGGTTCTTATAATCAAGACTAAGGAAGGCAAACTTGAAGAG360               GCAAAAACCGAGCTTGAAAAGCTAGGTGCAGAGATTCTTGACGAAAATAGAGTTCTTAAC420               ATGTTGCTAGTTAAGATTAAGCCTGAGAAAGTTAAAGAGCTCAACTATATCTCATCTCTT480               GAAAAAGCCTGGCTTAACAGAGAAGTTAAGCTTTCCCCTCCAATTGTCGAAAAGGACGTC540               AAGACTAAGGAGCCCTCCCTAGAACCAAAAATGTATAACAGCACCTGGGTAATTAATGCT600               CTCCAGTTCATCCAGGAATTTGGATATGATGGTAGTGGTGTTGTTGTTGCAGTACTTGAC660               ACGGGAGTTGATCCGAACCATCCTTTCTTGAGCATAACTCCAGATGGACGCAGGAAAATT720               ATAGAATGGAAGGATTTTACAGACGAGGGATTCGTGGATACATCATTCAGCTTTAGCAAG780               GTTGTAAATGGGACTCTTATAATTAACACAACATTCCAAGTGGCCTCAGGTCTCACGCTG840               AATGAATCGACAGGACTTATGGAATACGTTGTTAAGACTGTTTACGTGAGCAATGTGACC900               ATTGGAAATATCACTTCTGCTAATGGCATCTATCACTTCGGCCTGCTCCCAGAAAGATAC960               TTCGACTTAAACTTCGATGGTGATCAAGAGGACTTCTATCCTGTCTTATTAGTTAACTCC1020              ACTGGCAATGGTTATGACATTGCATATGTGGATACTGACCTTGACTACGACTTCACCGAC1080              GAAGTTCCACTTGGCCAGTACAACGTTACTTATGATGTTGCTGTTTTTAGCTACTACTAC1140              GGTCCTCTCAACTACGTGCTTGCAGAAATAGATCCTAACGGAGAATATGCAGTATTTGGG1200              TGGGATGGTCACGGTCACGGAACTCACGTAGCTGGAACTGTTGCTGGTTACGACAGCAAC1260              AATGATGCTTGGGATTGGCTCAGTATGTACTCTGGTGAATGGGAAGTGTTCTCAAGACTC1320              TATGGTTGGGATTATACGAACGTTACCACAGACACCGTGCAGGGTGTTGCTCCAGGTGCC1380              CAAATAATGGCAATAAGAGTTCTTAGGAGTGATGGACGGGGTAGCATGTGGGATATTATA1440              GAAGGTATGACATACGCAGCAACCCATGGTGCAGACGTTATAAGCATGAGTCTCGGTGGA1500              AATGCTCCATACTTAGATGGTACTGATCCAGAAAGCGTTGCTGTGGATGAGCTTACCGAA1560              AAGTACGGTGTTGTATTCGTAATAGCTGCAGGAAATGAAGGTCCTGGCATTAACATCGTT1620              GGAAGTCCTGGTGTTGCAACAAAGGCAATAACTGTTGGAGCTGCTGCAGTGCCCATTAAC1680              GTTGGAGTTTATGTTTCCCAAGCACTTGGATATCCTGATTACTATGGATTCTATTACTTC1740              CCCGCCTACACAAACGTTAGAATAGCATTCTTCTCAAGCAGAGGGCCGAGAATAGATGGT1800              
GAAATAAAACCCAATGTAGTGGCTCCAGGTTACGGAATTTACTCATCCCTGCCGATGTGG1860              ATTGGCGGAGCTGACTTCATGTCTGGAACTTCGATGGCTACTCCACATGTCAGCGGTGTC1920              GTTGCACTCCTCATAAGCGGGGCAAAGGCCGAGGGAATATACTACAATCCAGATATAATT1980              AAGAAGGTTCTTGAGAGCGGTGCAACCTGGCTTGAGGGAGATCCATATACTGGGCAGAAG2040              TACACTGAGCTTGACCAAGGTCATGGTCTTGTTAACGTTACCAAGTCCTGGGAAATCCTT2100              AAGGCTATAAACGGCACCACTCTCCCAATTGTTGATCACTGGGCAGACAAGTCCTACAGC2160              GACTTTGCGGAGTACTTGGGTGTGGACGTTATAAGAGGTCTCTACGCAAGGAACTCTATA2220              CCTGACATTGTCGAGTGGCACATTAAGTACGTAGGGGACACGGAGTACAGAACTTTTGAG2280              ATCTATGCAACTGAGCCATGGATTAAGCCTTTTGTCAGTGGAAGTGTAATTCTAGAGAAC2340              AATACCGAGTTTGTCCTTAGGGTGAAATATGATGTAGAGGGTCTTGAGCCAGGTCTCTAT2400              GTTGGAAGGATAATCATTGATGATCCAACAACGCCAGTTATTGAAGACGAGATCTTGAAC2460              ACAATTGTTATTCCCGAGAAGTTCACTCCTGAGAACAATTACACCCTCACCTGGTATGAT2520              ATTAATGGTCCAGAAATGGTGACTCACCACTTCTTCACTGTGCCTGAGGGAGTGGACGTT2580              CTCTACGCGATGACCACATACTGGGACTACGGTCTGTACAGACCAGATGGAATGTTTGTG2640              TTCCCATACCAGCTAGATTATCTTCCCGCTGCAGTCTCAAATCCAATGCCTGGAAACTGG2700              GAGCTAGTATGGACTGGATTTAACTTTGCACCCCTCTATGAGTCGGGCTTCCTTGTAAGG2760              ATTTACGGAGTAGAGATAACTCCAAGCGTTTGGTACATTAACAGGACATACCTTGACACT2820              AACACTGAATTCTCAATTGAATTCAATATTACTAACATCTATGCCCCAATTAATGCAACT2880              CTAATCCCCATTGGCCTTGGAACCTACAATGCGAGCGTTGAAAGCGTTGGTGATGGAGAG2940              TTCTTCATAAAGGGCATTGAAGTTCCTGAAGGCACCGCAGAGTTGAAGATTAGGATAGGC3000              AACCCAAGTGTTCCGAATTCAGATCTAGACTTGTACCTTTATGACAGTAAAGGCAATTTA3060              GTGGCCTTAGATGGAAACCCAACAGCAGAAGAAGAGGTTGTAGTTGAGTATCCTAAGCCT3120              GGAGTTTATTCAATAGTAGTACATGGTTACAGCGTCAGGGACGAAAATGGTAATCCAACG3180              ACAACCACCTTTGACTTAGTTGTTCAAATGACCCTTGATAATGGAAACATAAAGCTTGAC3240              AAAGACTCGATTATTCTTGGAAGCAATGAAAGCGTAGTTGTAACTGCAAACATAACAATT3300              
GATAGAGATCATCCTACAGGAGTATACTCTGGTATCATAGAGATTAGAGATAATGAGGTC3360              TACCAGGATACAAATACTTCAATTGCGAAAATACCCATAACTTTGGTAATTGACAAGGCG3420              GACTTTGCCGTTGGTCTCACACCAGCAGAGGGAGTACTTGGAGAGGCTAGAAATTACACT3480              CTAATTGTAAAGCATGCCCTAACACTAGAGCCTGTGCCAAATGCTACAGTGATTATAGGA3540              AACTACACCTACCTCACAGACGAAAACGGTACAGTGACATTCACGTATGCTCCAACTAAG3600              TTAGGCAGTGATGAAATCACAGTCATAGTTAAGAAAGAGAACTTCAACACATTAGAGAAG3660              ACCTTCCAAATCACAGTATCAGAGCCTGAAATAACTGAAGAGGACATAAATGAGCCCAAG3720              CTTGCAATGTCATCACCAGAAGCAAATGCTACCATAGTATCAGTTGAGATGGAGAGTGAG3780              GGTGGCGTTAAAAAGACAGTGACAGTGGAAATAACTATAAACGGAACCGCTAATGAGACT3840              GCAACAATAGTGGTTCCTGTTCCTAAGAAGGCCGAAAACATCGAGGTAAGTGGAGACCAC3900              GTAATTTCCTATAGTATAGAGGAAGGAGAGTACGCCAAGTACGTTATAATTACAGTGAAG3960              TTTGCATCACCTGTAACAGTAACTGTTACTTACACTATCTATGCTGGCCCAAGAGTCTCA4020              ATCTTGACACTTAACTTCCTTGGCTACTCATGGTACAGACTATATTCACAGAAGTTTGAC4080              GAATTGTACCAAAAGGCCCTTGAATTGGGAGTGGACAACGAGACATTAGCTTTAGCCCTC4140              AGCTACCATGAAAAAGCCAAAGAGTACTACGAAAAGGCCCTTGAGCTTAGCGAGGGTAAC4200              ATAATCCAATACCTTGGAGACATAAGACTATTACCTCCATTAAGACAGGCATACATCAAT4260              GAAATGAAGGCAGTTAAGATACTGGAAAAGGCCATAGAAGAATTAGAGGGTGAAGAGTAA4320              TCTCCAATTTTTCCCACTTTTTCTTTTATAACATTCCAAGCCTTTTCTTAGCTTCTTCGC4380              TCATTCTATCAGGAGTCCATGGAGGATCAAAGGTAAGTTCAACCTCCACATCTCTTACTC4440              CTGGGATTTCGAGTACTTTCTCCTCTACAGCTCTAAGAAGCCAGAGAGTTAAAGGACACC4500              CAGGAGTTGTCATTGTCATCTTTATATATACCGTTTTGTCAGGATTAATCTTTAGCTCAT4560              AAATTAATCCAAGGTTTACAACATCCATCCCAATTTCTGGGTCGATAACCTCCTTTAGCT4620              TTTCCAGAATCATTTCTTCAGTAATTTCAAGGTTCTCATCTTTGGTTTCTCTCACAAACC4680              CAATTTCAACCTGCCTGATACCTTCTAACTCCCTAAGCTTGTTATATATCTCCAAAAGAG4740              TGGCATCATCAATTTTCTCTTTAAA4765                                                 (2) INFORMATION FOR SEQ ID NO:9:                  
                            (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 1398 amino acids                                                  (B) TYPE: amino acid                                                          (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: peptide                                                   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:                                       MetAsnLysLysGlyLeuThrValLeuPheIleAlaIleMetLeuLeu                              151015                                                                        SerValValProValHisPheValSerAlaGluThrProProValSer                              202530                                                                        SerGluAsnSerThrThrSerIleLeuProAsnGlnGlnValValThr                              354045                                                                        LysGluValSerGlnAlaAlaLeuAsnAlaIleMetLysGlyGlnPro                              505560                                                                        AsnMetValLeuIleIleLysThrLysGluGlyLysLeuGluGluAla                              65707580                                                                      LysThrGluLeuGluLysLeuGlyAlaGluIleLeuAspGluAsnArg                              859095                                                                        ValLeuAsnMetLeuLeuValLysIleLysProGluLysValLysGlu                              100105110                                                                     LeuAsnTyrIleSerSerLeuGluLysAlaTrpLeuAsnArgGluVal                              115120125                                                                     LysLeuSerProProIleValGluLysAspValLysThrLysGluPro                              130135140                                                                     
SerLeuGluProLysMetTyrAsnSerThrTrpValIleAsnAlaLeu                              145150155160                                                                  GlnPheIleGlnGluPheGlyTyrAspGlySerGlyValValValAla                              165170175                                                                     ValLeuAspThrGlyValAspProAsnHisProPheLeuSerIleThr                              180185190                                                                     ProAspGlyArgArgLysIleIleGluTrpLysAspPheThrAspGlu                              195200205                                                                     GlyPheValAspThrSerPheSerPheSerLysValValAsnGlyThr                              210215220                                                                     LeuIleIleAsnThrThrPheGlnValAlaSerGlyLeuThrLeuAsn                              225230235240                                                                  GluSerThrGlyLeuMetGluTyrValValLysThrValTyrValSer                              245250255                                                                     AsnValThrIleGlyAsnIleThrSerAlaAsnGlyIleTyrHisPhe                              260265270                                                                     GlyLeuLeuProGluArgTyrPheAspLeuAsnPheAspGlyAspGln                              275280285                                                                     GluAspPheTyrProValLeuLeuValAsnSerThrGlyAsnGlyTyr                              290295300                                                                     AspIleAlaTyrValAspThrAspLeuAspTyrAspPheThrAspGlu                              305310315320                                                                  ValProLeuGlyGlnTyrAsnValThrTyrAspValAlaValPheSer                              325330335                                                                     TyrTyrTyrGlyProLeuAsnTyrValLeuAlaGluIleAspProAsn                              340345350                                         
                            GlyGluTyrAlaValPheGlyTrpAspGlyHisGlyHisGlyThrHis                              355360365                                                                     ValAlaGlyThrValAlaGlyTyrAspSerAsnAsnAspAlaTrpAsp                              370375380                                                                     TrpLeuSerMetTyrSerGlyGluTrpGluValPheSerArgLeuTyr                              385390395400                                                                  GlyTrpAspTyrThrAsnValThrThrAspThrValGlnGlyValAla                              405410415                                                                     ProGlyAlaGlnIleMetAlaIleArgValLeuArgSerAspGlyArg                              420425430                                                                     GlySerMetTrpAspIleIleGluGlyMetThrTyrAlaAlaThrHis                              435440445                                                                     GlyAlaAspValIleSerMetSerLeuGlyGlyAsnAlaProTyrLeu                              450455460                                                                     AspGlyThrAspProGluSerValAlaValAspGluLeuThrGluLys                              465470475480                                                                  TyrGlyValValPheValIleAlaAlaGlyAsnGluGlyProGlyIle                              485490495                                                                     AsnIleValGlySerProGlyValAlaThrLysAlaIleThrValGly                              500505510                                                                     AlaAlaAlaValProIleAsnValGlyValTyrValSerGlnAlaLeu                              515520525                                                                     GlyTyrProAspTyrTyrGlyPheTyrTyrPheProAlaTyrThrAsn                              530535540                                                                     ValArgIleAlaPhePheSerSerArgGlyProArgIleAspGlyGlu                              545550555560          
                                                        IleLysProAsnValValAlaProGlyTyrGlyIleTyrSerSerLeu                              565570575                                                                     ProMetTrpIleGlyGlyAlaAspPheMetSerGlyThrSerMetAla                              580585590                                                                     ThrProHisValSerGlyValValAlaLeuLeuIleSerGlyAlaLys                              595600605                                                                     AlaGluGlyIleTyrTyrAsnProAspIleIleLysLysValLeuGlu                              610615620                                                                     SerGlyAlaThrTrpLeuGluGlyAspProTyrThrGlyGlnLysTyr                              625630635640                                                                  ThrGluLeuAspGlnGlyHisGlyLeuValAsnValThrLysSerTrp                              645650655                                                                     GluIleLeuLysAlaIleAsnGlyThrThrLeuProIleValAspHis                              660665670                                                                     TrpAlaAspLysSerTyrSerAspPheAlaGluTyrLeuGlyValAsp                              675680685                                                                     ValIleArgGlyLeuTyrAlaArgAsnSerIleProAspIleValGlu                              690695700                                                                     TrpHisIleLysTyrValGlyAspThrGluTyrArgThrPheGluIle                              705710715720                                                                  TyrAlaThrGluProTrpIleLysProPheValSerGlySerValIle                              725730735                                                                     LeuGluAsnAsnThrGluPheValLeuArgValLysTyrAspValGlu                              740745750                                                                     GlyLeuGluProGlyLeuTyrValGlyArgIleIleIleAspAspPro                        
      755760765                                                                     ThrThrProValIleGluAspGluIleLeuAsnThrIleValIlePro                              770775780                                                                     GluLysPheThrProGluAsnAsnTyrThrLeuThrTrpTyrAspIle                              785790795800                                                                  AsnGlyProGluMetValThrHisHisPhePheThrValProGluGly                              805810815                                                                     ValAspValLeuTyrAlaMetThrThrTyrTrpAspTyrGlyLeuTyr                              820825830                                                                     ArgProAspGlyMetPheValPheProTyrGlnLeuAspTyrLeuPro                              835840845                                                                     AlaAlaValSerAsnProMetProGlyAsnTrpGluLeuValTrpThr                              850855860                                                                     GlyPheAsnPheAlaProLeuTyrGluSerGlyPheLeuValArgIle                              865870875880                                                                  TyrGlyValGluIleThrProSerValTrpTyrIleAsnArgThrTyr                              885890895                                                                     LeuAspThrAsnThrGluPheSerIleGluPheAsnIleThrAsnIle                              900905910                                                                     TyrAlaProIleAsnAlaThrLeuIleProIleGlyLeuGlyThrTyr                              915920925                                                                     AsnAlaSerValGluSerValGlyAspGlyGluPhePheIleLysGly                              930935940                                                                     IleGluValProGluGlyThrAlaGluLeuLysIleArgIleGlyAsn                              945950955960                                                                  
ProSerValProAsnSerAspLeuAspLeuTyrLeuTyrAspSerLys                              965970975                                                                     GlyAsnLeuValAlaLeuAspGlyAsnProThrAlaGluGluGluVal                              980985990                                                                     ValValGluTyrProLysProGlyValTyrSerIleValValHisGly                              99510001005                                                                   TyrSerValArgAspGluAsnGlyAsnProThrThrThrThrPheAsp                              101010151020                                                                  LeuValValGlnMetThrLeuAspAsnGlyAsnIleLysLeuAspLys                              1025103010351040                                                              AspSerIleIleLeuGlySerAsnGluSerValValValThrAlaAsn                              104510501055                                                                  IleThrIleAspArgAspHisProThrGlyValTyrSerGlyIleIle                              106010651070                                                                  GluIleArgAspAsnGluValTyrGlnAspThrAsnThrSerIleAla                              107510801085                                                                  LysIleProIleThrLeuValIleAspLysAlaAspPheAlaValGly                              109010951100                                                                  LeuThrProAlaGluGlyValLeuGlyGluAlaArgAsnTyrThrLeu                              1105111011151120                                                              IleValLysHisAlaLeuThrLeuGluProValProAsnAlaThrVal                              112511301135                                                                  IleIleGlyAsnTyrThrTyrLeuThrAspGluAsnGlyThrValThr                              114011451150                                                                  PheThrTyrAlaProThrLysLeuGlySerAspGluIleThrValIle                              115511601165                                      
                            ValLysLysGluAsnPheAsnThrLeuGluLysThrPheGlnIleThr                              117011751180                                                                  ValSerGluProGluIleThrGluGluAspIleAsnGluProLysLeu                              1185119011951200                                                              AlaMetSerSerProGluAlaAsnAlaThrIleValSerValGluMet                              120512101215                                                                  GluSerGluGlyGlyValLysLysThrValThrValGluIleThrIle                              122012251230                                                                  AsnGlyThrAlaAsnGluThrAlaThrIleValValProValProLys                              123512401245                                                                  LysAlaGluAsnIleGluValSerGlyAspHisValIleSerTyrSer                              125012551260                                                                  IleGluGluGlyGluTyrAlaLysTyrValIleIleThrValLysPhe                              1265127012751280                                                              AlaSerProValThrValThrValThrTyrThrIleTyrAlaGlyPro                              128512901295                                                                  ArgValSerIleLeuThrLeuAsnPheLeuGlyTyrSerTrpTyrArg                              130013051310                                                                  LeuTyrSerGlnLysPheAspGluLeuTyrGlnLysAlaLeuGluLeu                              131513201325                                                                  GlyValAspAsnGluThrLeuAlaLeuAlaLeuSerTyrHisGluLys                              133013351340                                                                  AlaLysGluTyrTyrGluLysAlaLeuGluLeuSerGluGlyAsnIle                              1345135013551360                                                              IleGlnTyrLeuGlyAspIleArgLeuLeuProProLeuArgGlnAla                              136513701375          
                                                        TyrIleAsnGluMetLysAlaValLysIleLeuGluLysAlaIleGlu                              138013851390                                                                  GluLeuGluGlyGluGlu                                                            1395                                                                          (2) INFORMATION FOR SEQ ID NO:10:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 145 base pairs                                                    (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: cDNA                                                      (ix) FEATURE:                                                                 (A) NAME/KEY: CDS                                                             (B) LOCATION: 2..145                                                          (xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:                                      AGTTGCGGTAATTGACACGGGTATAGACGCGAACCACCCCGATCTG46                              ValAlaValIleAspThrGlyIleAspAlaAsnHisProAspLeu                                 151015                                                                        AAGGGCAAGGTCATAGGCTGGTACGACGCCGTCAACGGCAGGTCGACC94                            LysGlyLysValIleGlyTrpTyrAspAlaValAsnGlyArgSerThr                              202530                                                                        CCCTACGATGACCAGGGACACGGAACTCACGTNGCNGGAACNGTTGCT142                           ProTyrAspAspGlnGlyHisGlyThrHisValAlaGlyThrValAla                              354045                                                                        GGT145                                                                  
      Gly                                                                           (2) INFORMATION FOR SEQ ID NO:11:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 564 base pairs                                                    (B) TYPE: nucleic acid                                                        (C) STRANDEDNESS: single                                                      (D) TOPOLOGY: linear                                                          (ii) MOLECULE TYPE: cDNA                                                      (ix) FEATURE:                                                                 (A) NAME/KEY: CDS                                                             (B) LOCATION: 1..564                                                          (xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:                                      TCTCACGGAACTCACGTGGCGGGAACAGTTGCCGGAACAGGCAGCGTT48                            SerHisGlyThrHisValAlaGlyThrValAlaGlyThrGlySerVal                              505560                                                                        AACTCCCAGTACATAGGCGTCGCCCCCGGCGCGAAGCTCGTCGGTGTC96                            AsnSerGlnTyrIleGlyValAlaProGlyAlaLysLeuValGlyVal                              65707580                                                                      AAGGTTCTCGGTGCCGACGGTTCGGGAAGCGTCTCCACCATCATCGCG144                           LysValLeuGlyAlaAspGlySerGlySerValSerThrIleIleAla                              859095                                                                        GGTGTTGACTGGGTCGTCCAGAACAAGGATAAGTACGGGATAAGGGTC192                           GlyValAspTrpValValGlnAsnLysAspLysTyrGlyIleArgVal                              100105110                                                                     ATCAACCTCTCCCTCGGCTCCTCCCAGAGCTCCGACGGAGCCGACTCC240                           
IleAsnLeuSerLeuGlySerSerGlnSerSerAspGlyAlaAspSer                              115120125                                                                     CTCAGTCAGGCCGTCAACAACGCCTGGGACGCCGGTATAGTAGTCTGC288                           LeuSerGlnAlaValAsnAsnAlaTrpAspAlaGlyIleValValCys                              130135140                                                                     GTCGCCGCCGGCAACAGCGGGCCGAACACCTACACCGTCGGCTCACCC336                           ValAlaAlaGlyAsnSerGlyProAsnThrTyrThrValGlySerPro                              145150155160                                                                  GCCGCCGCGAGCAAGGTCATAACCGTCGGTGCAGTTGACAGCAACGAC384                           AlaAlaAlaSerLysValIleThrValGlyAlaValAspSerAsnAsp                              165170175                                                                     AACATCGCCAGCTTCTCCAGCAGGGGACCGACCGCGGACGGAAGGCTC432                           AsnIleAlaSerPheSerSerArgGlyProThrAlaAspGlyArgLeu                              180185190                                                                     AAGCCGGAAGTCGTCGCCCCCGGCGTTGACATCATAGCCCCGCGCGCC480                           LysProGluValValAlaProGlyValAspIleIleAlaProArgAla                              195200205                                                                     AGCGGAACCAGCATGGGCACCCCGATAAACGACTACTACACCAAGGCC528                           SerGlyThrSerMetGlyThrProIleAsnAspTyrTyrThrLysAla                              210215220                                                                     TCTGGAACCTCAATGGCCACTCCCCATGTTACCGGT564                                       SerGlyThrSerMetAlaThrProHisValThrGly                                          225230235                                                                     (2) INFORMATION FOR SEQ ID NO:12:                                             (i) SEQUENCE CHARACTERISTICS:                                                 (A) LENGTH: 20 base pairs                         
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
GGCAAGGTCATAGGCTGGTA 20
(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
CCAGAACAAGGATAAGTACG 20
(2) INFORMATION FOR SEQ ID NO:14:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
GGCACCCCGATAAACGACTA 20
(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
ACGCCTATGTACTGGGAGTT 20
(2) INFORMATION FOR SEQ ID NO:16:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
CGTACTTATCCTTGTTCTGG 20
(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
TGTAGTAGTCGTTTATCGGG 20
(2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 237 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
Asp Leu Lys Gly Lys Val Ile Gly Trp Tyr Asp Ala Val Asn Gly Arg
1 5 10 15
Ser Thr Pro Tyr Asp Asp Gln Gly His Gly Thr His Val Ala Gly Ile
20 25 30
Val Ala Gly Thr Gly Ser Val Asn Ser Gln Tyr Ile Gly Val Ala Pro
35 40 45
Gly Ala Lys Leu Val Gly Val Lys Val Leu Gly Ala Asp Gly Ser Gly
50 55 60
Ser Val Ser Thr Ile Ile Ala Gly Val Asp Trp Val Val Gln Asn Lys
65 70 75 80
Asp Xaa Tyr Gly Ile Arg Val Ile Asn Leu Ser Leu Gly Ser Ser Gln
85 90 95
Ser Ser Asp Gly Thr Asp Ser Leu Ser Gln Ala Val Asn Asn Ala Trp
100 105 110
Asp Ala Gly Ile Val Val Cys Val Ala Ala Gly Asn Ser Gly Pro Asn
115 120 125
Thr Tyr Thr Val Gly Ser Pro Ala Ala Ala Ser Lys Val Ile Thr Val
130 135 140
Gly Ala Val Asp Ser Asn Asp Asn Ile Ala Ser Phe Ser Ser Arg Gly
145 150 155 160
Pro Thr Ala Asp Gly Arg Leu Lys Pro Glu Val Val Ala Pro Gly Val
165 170 175
Asp Ile Ile Ala Pro Arg Ala Ser Gly Thr Ser Met Gly Thr Pro Ile
180 185 190
Asn Asp Tyr Xaa Asn Lys Gly Ser Gly Ser Ser Met Asp Thr Pro His
195 200 205
Val Ser Gly Val Gly Gly Leu Ile Leu Gln Ala His Pro Ser Trp Thr
210 215 220
Pro Asp Lys Val Lys Thr Pro Ser Ser Arg Pro Pro Thr
225 230 235
__________________________________________________________________________
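The short entries of the Sequence Listing can be cross-checked mechanically. The following is a minimal sketch, not part of the disclosed method: the function names `reverse_complement` and `translate` are hypothetical, the standard genetic code is assumed, and codons containing an ambiguity symbol such as N are simply reported as `X`. It checks two things that can be read off the listing itself: that the first 46 bases of SEQ ID NO:10, read from position 2 as the CDS feature indicates, translate to the residues printed beneath them, and that the strings given for SEQ ID NO:13 and SEQ ID NO:16 are reverse complements of each other.

```python
# Illustrative consistency checks on short Sequence Listing entries.
# Function names are hypothetical; the standard genetic code is assumed.

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ..., GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str, start: int = 1) -> str:
    """Translate DNA to one-letter amino acids.

    `start` is 1-based, like the CDS LOCATION field. A trailing partial
    codon is ignored; codons with ambiguity symbols become 'X'.
    """
    coding = seq[start - 1:]
    usable = len(coding) - len(coding) % 3
    return "".join(
        CODON_TABLE.get(coding[i:i + 3], "X") for i in range(0, usable, 3)
    )


# First sequence row of SEQ ID NO:10 (bases 1..46), copied from the listing.
SEQ_10_ROW_1 = "AGTTGCGGTAATTGACACGGGTATAGACGCGAACCACCCCGATCTG"
SEQ_13 = "CCAGAACAAGGATAAGTACG"
SEQ_16 = "CGTACTTATCCTTGTTCTGG"

if __name__ == "__main__":
    print(translate(SEQ_10_ROW_1, start=2))
    print(reverse_complement(SEQ_13) == SEQ_16)
```

The one-letter result for the SEQ ID NO:10 row corresponds to ValAlaValIleAspThrGlyIleAspAlaAsnHisProAspLeu as printed in the listing. Rows containing N would need an ambiguity-aware table, which this sketch deliberately leaves out.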

What is claimed is:
 1. An isolated hyperthermostable protease gene originating in Pyrococcus furiosus.
 2. A hyperthermostable protease gene of claim 1 which encodes the amino acid sequence of SEQ ID NO: 1 or an enzymatically active fragment thereof.
 3. A hyperthermostable protease gene of claim 1 which comprises the nucleotide sequence represented by the SEQ ID NO 2 in the Sequence Listing.
 4. A hyperthermostable protease gene which is hybridizable with the hyperthermostable protease gene of claim 2 or DNA selected from the group consisting of the nucleotide sequences represented by SEQ ID NO 3, 4, 5 and 6 in the Sequence Listing which are part of the hyperthermostable protease gene of claim 1.
 5. A hyperthermostable protease gene of claim 4 which comprises the nucleotide sequence represented by the SEQ ID NO 7.
 6. A process for producing a hyperthermostable protease which comprises culturing a transformant transformed with a plasmid into which the hyperthermostable protease gene of claim 1 has been transduced, and collecting the hyperthermostable protease from the culture.
 7. A process for producing a hyperthermostable protease which comprises culturing a transformant transformed with a plasmid into which the hyperthermostable protease gene of claim 4 has been transduced, and collecting the hyperthermostable protease from the culture.